Iterative Min-Sum Decoding of Tail-biting Codes¹

Srinivas Aji, Gavin Horn, Robert McEliece, and Meina Xu
Department of Electrical Engineering, California Institute of Technology, Pasadena, California 91125, USA
e-mail: mas, gavinh, rjm, [email protected]

¹ This work was partially supported by NSF grant no. NCR-9505975, AFOSR grant no. 5F49620-971-0313, and a grant from Qualcomm, Inc. Gavin Horn's contribution was also supported by an NSERC scholarship.

Abstract: By invoking a form of the Perron-Frobenius theorem for the "min-sum" semiring, we obtain a union bound on the performance of iterative decoding of tail-biting codes for both the AWGN channel and the BSC. This bound shows that iterative decoding will be nearly optimum, at least for high SNRs, if and only if the minimum "pseudodistance" of the code is larger than the ordinary minimum distance. We prove that the minimum pseudodistance of any 2-segment "pseudocodeword" must be at least the ordinary minimum distance. Unfortunately, the minimum pseudodistance of a 3 (or more)-segment pseudocodeword can be less than the ordinary minimum distance. We will present a tail-biting trellis for the (24, 12, 8) Golay code which has a pseudocodeword with pseudoweight less than the minimum distance of the code.
1 Introduction

Because of the remarkable success of the iterative turbo-decoding algorithm [7], many coding researchers have been focusing on the study of other sub-optimal iterative decoding algorithms. Perhaps the simplest such algorithm is the iterative decoding of tail-biting codes. In this paper we show that iterative min-sum decoding of a tail-biting code will be effective if and only if the minimum "pseudoweight" of the code is strictly greater than its ordinary minimum weight. Closely related results were discovered independently by Wiberg in his thesis [15] and by Forney et al. in [10]. This paper is an extension of [2]. Besides containing some new results on pseudoweights, it also includes results for the BSC.
2 Perron-Frobenius Theorem for the min-sum semiring

In this section we will state without proof a "Perron-Frobenius" theorem for the min-sum semiring, which explains the behavior of the iterative decoding algorithm to be presented in Section 4. (Cf. the usual "sum-product" P.-F. theorem, e.g. [11, Theorem 4.5.12].) Let $A$ be an $s \times s$ irreducible matrix with entries from $\mathbb{R} \cup \{\infty\}$, with rows and columns indexed by $\{1, 2, \ldots, s\}$, and let $G$ be the corresponding weighted digraph. Assume that among all simple closed paths in $G$ there is a unique one with minimum average edge weight, and that this "critical cycle" is in fact a self-loop of weight $\lambda$ at vertex 1. (We summarize this condition by saying that $A$ has a "simple eigenvalue.") Then for $n$ sufficiently large, with min-sum arithmetic,
$$A^n = \lambda^n E.$$
Here $E$ is a fixed $s \times s$ "rank one" matrix, i.e., $E_{i,j} = x_i y_j$, where $x = (x_1, \ldots, x_s)$ and $y = (y_1, \ldots, y_s)$ are right and left "eigenvectors" for $A$ with corresponding "eigenvalue" $\lambda$. The numbers $x_i$ and $y_j$ have the following geometric interpretation. If we reduce each entry in $A$ by $\lambda$, and if we denote this modified matrix by $A'$ and the corresponding weighted digraph by $G'$, then $x_i$ is the least weight of any path in $G'$ from vertex $i$ to vertex 1, and $y_j$ is the least weight of any path from vertex 1 to vertex $j$.
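To make the statement concrete, here is a small numeric illustration (a sketch of our own; the matrix below is an arbitrary example, not taken from the paper) of min-sum matrix powers stabilizing at the rank-one form $A^n = \lambda^n E$.

```python
import numpy as np

def minsum_matmul(A, B):
    """Matrix product in the (min, +) semiring: C[i, j] = min_k (A[i, k] + B[k, j])."""
    s = A.shape[0]
    return np.array([[min(A[i, k] + B[k, j] for k in range(s))
                      for j in range(s)] for i in range(s)])

# An arbitrary 3 x 3 irreducible example (our choice).  Its unique minimum-mean
# cycle is the self-loop of weight 1 at vertex 0, so lambda = 1 here.
A = np.array([[1.0, 4.0, 6.0],
              [2.0, 5.0, 3.0],
              [7.0, 2.0, 4.0]])
lam = 1.0

An, prev = A.copy(), None
for n in range(2, 20):
    An = minsum_matmul(An, A)
    normalized = An - n * lam          # should stop changing for large n
    if prev is not None and np.array_equal(normalized, prev):
        print("A^n - n*lambda has stabilized, so A^n = lambda^n E in min-sum arithmetic:")
        print(normalized)              # this is E; E[i, j] = x_i + y_j, the min-sum product x_i y_j
        break
    prev = normalized
```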
3 Tail-Biting Codes and Pseudo-Codewords

In this section we give a brief introduction to tail-biting codes. For further details, we refer the reader to [8]. A tail-biting trellis is a finite, labeled digraph in which the vertices are partitioned into $n$ classes $V_0, \ldots, V_{n-1}$, each class being indexed by an element of $\mathbb{Z}_n = \{0, 1, \ldots, n-1\}$, the cyclic group of order $n$. (All index arithmetic is done modulo $n$.) If $E$ is an edge, we denote the initial vertex of $E$ by $\mathrm{init}(E)$ and the final vertex of $E$ by $\mathrm{fin}(E)$. An edge $E$ must have $\mathrm{init}(E) \in V_k$ and $\mathrm{fin}(E) \in V_{k+1}$, for some $k \in \mathbb{Z}_n$. The label of such an edge, denoted $\mathrm{out}(E)$ (for "output"), belongs to a finite alphabet $A_k$. If $P$ is a path, the label of $P$, denoted $\mathrm{out}(P)$, is the concatenation of the labels of the edges comprising $P$. We call $\mathrm{out}(P)$ the output of the path $P$. An $L$-segment tail-biting path $P$ is a trellis path of length $Ln$ for which $\mathrm{init}(P) = \mathrm{fin}(P)$. The code generated by the tail-biting trellis is the set of outputs of the one-segment tail-biting paths. A pseudocodeword is the output of any tail-biting path of one or more segments.
4 Iterative Decoding of Tail-biting Codes

The iterative min-sum decoding algorithm for tail-biting codes is discussed explicitly in [13, 5, 9, 14]. Our view is that it is an application of the Generalized Distributive Law [4, 1], as applied to a junction graph with just one cycle [3]. In any case, if $y$ is the received noisy codeword, then after a finite number of iterations the decoder will "lock on" to the pseudocodeword nearest to $y$, which is called the dominant pseudocodeword in [9]. This follows from the min-sum Perron-Frobenius theorem (alternatively see [14] or [9]). Here the appropriate matrix $A$ has entry $a_{i,j}$ given by
$$a_{i,j} = \min\{-\log p(y \mid \mathrm{out}(P)) : \mathrm{init}(P) = i,\ \mathrm{fin}(P) = j\},$$
the minimum being over one-segment paths $P$. An ML decoder will compute $\min_i \{a_{i,i}\}$, since that corresponds to the most likely tail-biting codeword. On the other hand, a two-way iterative min-sum decoding algorithm will converge, after a finite number of iterations, to the same result, provided $A$ has a simple eigenvalue. In coding terms, this condition amounts to saying that there is a unique nearest pseudocodeword to $y$, which is in fact a codeword. This fact allows us to bound the probability of decoder error, using the familiar union bound argument.
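The following sketch illustrates the distinction in this matrix language. The numbers are hypothetical segment metrics (invented for illustration, not computed from any actual trellis or channel): $\min_i a_{i,i}$ is the ML metric, while the minimum cycle mean of $A$ is the per-segment metric of the dominant pseudocodeword that iterative decoding locks onto.

```python
import numpy as np

def minsum_matmul(A, B):
    """(min, +) matrix product."""
    s = A.shape[0]
    return np.array([[min(A[i, k] + B[k, j] for k in range(s))
                      for j in range(s)] for i in range(s)])

# Hypothetical one-segment metrics a[i, j]: the best metric (e.g. -log p(y | out(P)))
# over one-segment paths P with init(P) = i and fin(P) = j.  Invented numbers.
A = np.array([[3.0, 1.0, 9.0],
              [1.5, 4.0, 6.0],
              [8.0, 7.0, 5.0]])
s = A.shape[0]

ml_metric = min(A[i, i] for i in range(s))        # best one-segment (tail-biting) codeword

# Best metric *per segment* over all closed walks = minimum cycle mean of A.
# A minimum-mean cycle uses at most s edges, so k = 1 .. s suffices.
best_per_segment = ml_metric
Ak = A.copy()
for k in range(2, s + 1):
    Ak = minsum_matmul(Ak, A)
    best_per_segment = min(best_per_segment, min(Ak[i, i] for i in range(s)) / k)

print("ML metric (best codeword):          ", ml_metric)          # 3.0
print("best pseudocodeword metric/segment: ", best_per_segment)   # 1.25 < 3.0, so a
# 2-segment pseudocodeword dominates and iterative decoding would lock onto it.
```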
5 The Union Bound for Iterative Decoding on the AWGN Channel

In this section we restrict attention to binary linear (tail-biting) codes, used with BPSK modulation on an additive white Gaussian noise channel. We will use the insight gained in the previous section (the decoder converges to the nearest pseudocodeword) to obtain a union bound on the decoder word error probability. Let $x = (x_1, \ldots, x_L)$ be an $L$-segment $(0,1)$ pseudocodeword, and let $s = (s_1, \ldots, s_L)$ be the transmitted signal corresponding to $x$, where 0 is mapped into 0 and 1 is mapped into $2\sqrt{E_s}$. Assume the all-zero codeword $\mathbf{0}$ is transmitted, and $r$ is received. Then a decoder error occurs, i.e., the decoder prefers $x$ to $\mathbf{0}$, when $\sum_{i=1}^{L} |r - s_i|^2 \le L\,|r|^2$, or equivalently, when $2\big(\sum_{i=1}^{L} s_i\big)^{T} r \ge \sum_{i=1}^{L} |s_i|^2$. Hence the probability of this decoding error is
$$Q\!\left(\frac{\sum_{i=1}^{L} |s_i|^2}{2\sigma \big|\sum_{i=1}^{L} s_i\big|}\right), \qquad (1)$$
where $Q(t) = (1/\sqrt{2\pi}) \int_t^{\infty} e^{-s^2/2}\, ds$ and $\sigma^2 = N_0/2$. If $c_j$ (the $j$th column sum) is defined to be $c_j = \sum_{i=1}^{L} x_{i,j}$ for $j = 1, \ldots, n$, then $\sum_{i=1}^{L} |s_i|^2 = \big(\sum_j c_j\big)\, 4E_s$, $\big|\sum_{i=1}^{L} s_i\big| = \sqrt{\big(\sum_j c_j^2\big)\, 4E_s}$, and eq. (1) becomes
$$Q\!\left(\sqrt{\frac{\big(\sum_j c_j\big)^2}{\sum_j c_j^2}\cdot \frac{2E_s}{N_0}}\right). \qquad (2)$$
In view of (2), we define the pseudoweight $w(x)$ to be
$$w(x) = \frac{\big(\sum_j c_j\big)^2}{\sum_j c_j^2}.$$
Thus, for example, the three-segment pseudocodeword $(0000)\,(0101)\,(0011)$ has $c_1 = 0$, $c_2 = c_3 = 1$, and $c_4 = 2$, so that its pseudoweight is $(0+1+1+2)^2/(0^2+1^2+1^2+2^2) = 8/3$. Note that the pseudoweight of an ordinary (1-segment) codeword is the same as its weight as usually defined. Let $C$ denote the set of all codewords, and $P$ the set of all simple pseudocodewords (a simple pseudocodeword is one whose underlying path does not pass through the same vertex twice). The above argument implies a union bound on the (iterative min-sum) decoder word error probability $P_E^{IT}$ (for completeness we have also included the ordinary union bound on $P_E^{ML}$, the maximum-likelihood word error probability):
$$P_E^{IT} \le \sum_{x \in P} Q\!\left(\sqrt{2 R\, w(x)\, E_b/N_0}\right), \qquad (3)$$
$$P_E^{ML} \le \sum_{x \in C} Q\!\left(\sqrt{2 R\, w(x)\, E_b/N_0}\right). \qquad (4)$$
In general it is not easy to compute the pseudoweight enumerator for a given code. However, in the next section we will do so for the (8, 4, 4) Hamming code.
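Computing the pseudoweight of any single given pseudocodeword is immediate, though. A short sketch of our own that checks the 8/3 example above:

```python
# Pseudoweight w(x) = (sum_j c_j)^2 / (sum_j c_j^2), from the column sums of the
# segments; the example is the three-segment pseudocodeword (0000)(0101)(0011).
segments = [(0, 0, 0, 0), (0, 1, 0, 1), (0, 0, 1, 1)]
c = [sum(col) for col in zip(*segments)]            # column sums: [0, 1, 1, 2]
w = sum(c) ** 2 / sum(cj * cj for cj in c)          # 16/6 = 8/3
print(c, w)
```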
6 The (8, 4, 4) Hamming Code on the AWGN Channel
In [8, Section 5.2], an optimal tail-biting trellis for the extended (8, 4, 4) binary Hamming code is constructed, with state-complexity profile (2, 4, 4, 4, 2, 4, 4, 4). We have used this trellis to experiment with the iterative min-sum decoding algorithm. In Figure 1 we have plotted the actual performance (bit error probability) of an ML decoder and an iterative min-sum decoder (five iterations) for the (8, 4, 4) Hamming code, for values of $E_b/N_0$ ranging from 0 dB to 9 dB in increments of 0.5 dB. We see no measurable difference in performance, although we know that theoretically the iterative algorithm is not as good as ML because of the presence of pseudocodewords.
[Figure 1 here: BER versus $E_b/N_0$ (dB). Curves: union bound without pseudocodewords, union bound with pseudocodewords, ML decoding, iterative decoding (5 iterations).]
Figure 1: (8,4,4) tail-biting Hamming code union bound on AWGN channel

The pseudoweight enumerator for the (8, 4, 4) Hamming code, as represented by the minimal tail-biting trellis from [8], is given in the table below. The first row is the ordinary weight enumerator, i.e., a list of the weights of the codewords (the one-segment pseudocodewords). The second row is the pseudoweight enumerator for the 64 two-segment pseudocodewords. (There can be no simple pseudocodewords with more than two segments, because the trellis has state complexity 2 at two indices.)

(pseudo)weight :   0    4    4 1/2    6 1/4    7 2/17    8
W.E.           :   1   14     --       --        --      1
P.W.E.         :  --   --     32       30         2     --

In Figure 1 we have plotted the bounds of eqs. (4) and (3), using the data from this table, modified to give bounds on bit error probability. These bounds are asymptotically equal, because the leading term in both bounds is the same: the minimum pseudoweight (in this case 4.5) is strictly larger than the minimum weight (in this case 4).
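For a numeric sense of the bounds (3) and (4), the following sketch evaluates them for this code at one sample $E_b/N_0$. The (pseudo)weights and multiplicities are read from the table above (the pairing of the multiplicities 32, 30, 2 with the pseudoweights follows the reconstructed table), and $R = 1/2$ is the code rate; the 6 dB operating point is an arbitrary choice.

```python
from math import erfc, sqrt

def Q(t):
    """Gaussian tail function Q(t)."""
    return 0.5 * erfc(t / sqrt(2.0))

R = 4 / 8                               # rate of the (8, 4, 4) code
EbN0 = 10 ** (6.0 / 10)                 # sample operating point: 6 dB

codeword_terms = [(4, 14), (8, 1)]                       # (weight, multiplicity), nonzero codewords
pseudo_terms   = [(4.5, 32), (6.25, 30), (121 / 17, 2)]  # two-segment pseudocodewords

bound_ML = sum(m * Q(sqrt(2 * R * w * EbN0)) for w, m in codeword_terms)           # eq. (4)
bound_IT = bound_ML + sum(m * Q(sqrt(2 * R * w * EbN0)) for w, m in pseudo_terms)  # eq. (3)
print(bound_ML, bound_IT)
```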
7 Pseudoweights on the AWGN Channel

Because of the union bound in eq. (3), if the minimum pseudoweight is greater than the minimum weight for a given code, then the performance of iterative min-sum decoding will be asymptotically the same as that of ML decoding. The problem remains to find the pseudoweights. The following lemma shows that any 2-segment pseudocodeword has pseudoweight at least the minimum weight when the two segments are not identical.
Lemma 1 Let $x_1$ and $x_2$ be $n$-dimensional vectors in Euclidean space. Then
$$\frac{|x_1|^2 + |x_2|^2}{|x_1 + x_2|} \ \ge\ |x_1 - x_2|.$$

[Figure 2 here: the parallelogram spanned by $x_1$ and $x_2$, with diagonals $x_1 + x_2$ and $x_1 - x_2$.]

Figure 2

Proof: $2\,|x_1 - x_2|\,|x_1 + x_2| \le |x_1 - x_2|^2 + |x_1 + x_2|^2 = 2|x_1|^2 + 2|x_2|^2$ (see Figure 2 for the last equality, which is the parallelogram law). $\Box$
Theorem 1 The pseudoweight of any nonzero 2-segment pseudocodeword must be at least the minimum weight if the two segments are not equal.
Proof: From the definition of pseudoweight, the pseudoweight of a 2-segment pseudocodeword $x$ with segments $x_1$ and $x_2$ can be expressed as
$$w(x) = \frac{\big(|x_1|^2 + |x_2|^2\big)^2}{|x_1 + x_2|^2} \ \ge\ |x_1 - x_2|^2$$
(since the segments are 0/1 vectors, $\sum_j c_j = |x_1|^2 + |x_2|^2$ and $\sum_j c_j^2 = |x_1 + x_2|^2$; the inequality is by the Lemma). But $x_1 - x_2$, i.e., $x_1 \oplus x_2$, is an ordinary (tail-biting) codeword, and $|x_1 - x_2|^2$ is its weight. $\Box$

If we assume the tail-biting trellis is "n-observable", i.e., that $P$ can be inferred from $\mathrm{out}(P)$ for paths of length $n$, then the conclusion of Theorem 1 holds even if $x_1 = x_2$. For tail-biting trellises in which the minimum number of states is greater than 2, the 2-segment pseudocodewords are not all the pseudocodewords, and the pseudoweight of a 3 (or more)-segment pseudocodeword can be less than the minimum distance. Koetter and Vardy have shown an example in which the pseudoweight of a 3-segment pseudocodeword is less than the minimum distance of the code [12]. Anderson directed us to a 64-state tail-biting trellis for the extended (24, 12, 8) Golay code [6], and we have found that this tail-biting trellis has a 4-segment pseudocodeword of pseudoweight 7.36, which is less than the minimum distance 8 of the code. In [8], an optimal 16-state tail-biting trellis is constructed for the same Golay code. (For this 16-state tail-biting trellis, we have found that the pseudoweights of the 3-segment and 4-segment pseudocodewords are at least the minimum distance.)
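A randomized sanity check (our own, not from the paper) of the key inequality in the proof of Theorem 1, $w(x) \ge |x_1 - x_2|^2$, on random unequal 0/1 segments:

```python
import random

def pseudoweight(c):
    return sum(c) ** 2 / sum(cj * cj for cj in c)

random.seed(1)
for _ in range(10000):
    x1 = [random.randint(0, 1) for _ in range(8)]
    x2 = [random.randint(0, 1) for _ in range(8)]
    if x1 == x2:
        continue
    c = [a + b for a, b in zip(x1, x2)]               # column sums of the two segments
    d2 = sum((a - b) ** 2 for a, b in zip(x1, x2))    # |x1 - x2|^2 = weight of x1 XOR x2
    assert pseudoweight(c) >= d2 - 1e-12
print("w(x) >= |x1 - x2|^2 held in every trial")
```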
[Figure 3 here: BER versus $E_b/N_0$ (dB). Curves: ML decoding, iterative decoding (16-state tail-biting trellis), iterative decoding (64-state tail-biting trellis).]
Figure 3: (24,12,8) tail-biting Golay code performance on AWGN channel

We have plotted the min-sum iterative decoding performance of both trellises, as well as the ML decoding performance, in Figure 3. While there is not much difference between the performance of ML decoding and min-sum iterative decoding of the 16-state tail-biting trellis, there is about a 0.5 dB difference between ML decoding and min-sum iterative decoding of the 64-state tail-biting trellis, due to the low-weight pseudocodeword.
8 The Union Bound for Iterative Decoding on the BSC
With $x = (x_1, \ldots, x_L)$ and $c = (c_1, \ldots, c_n)$ as defined in Section 5, we will obtain a union bound on the decoder word error probability for the BSC. Let the all-zero codeword $\mathbf{0}$ be transmitted, and let $y = (y_1, \ldots, y_n)$ be the received word. Then for a general two-input memoryless channel, a decoding error occurs if
$$\prod_{i=1}^{L} P(y \mid x_i) \ \ge\ P(y \mid \mathbf{0})^L, \quad\text{i.e., if}\quad \prod_{i=1}^{n} P(y_i \mid 1)^{c_i} \ \ge\ \prod_{i=1}^{n} P(y_i \mid 0)^{c_i}, \quad\text{i.e., if}\quad \sum_{i=1}^{n} c_i \log\frac{P(y_i \mid 1)}{P(y_i \mid 0)} \ \ge\ 0. \qquad (5)$$
On a BSC with crossover probability $p$ (so $p < 1-p$), eq. (5) becomes
$$\Big(\sum_{i:\, y_i = 1} c_i - \sum_{j:\, y_j = 0} c_j\Big)\, \log\frac{1-p}{p} \ \ge\ 0, \quad\text{i.e.,}\quad \sum_{i:\, y_i = 1} c_i - \sum_{j:\, y_j = 0} c_j \ \ge\ 0. \qquad (6)$$
Thus the decoder word error probability $P_E^{IT,BSC}$ satisfies
$$P_E^{IT,BSC} \ \le\ \sum_{i=1}^{n} q(c, i)\, p^i (1-p)^{n-i}, \qquad (7)$$
where $q(c, i)$ is the number of ways of choosing $i$ positions with $y_i = 1$ (i.e., the number of weight-$i$ error patterns) such that eq. (6) is true. Thus, for example, for $c = (12121211)$ the $q(c, i)$'s are given in the table below.

i     q(c, i)
0     0
1     0
2     0
3     $\binom{3}{3}\binom{5}{0}$
4     $\binom{3}{3}\binom{5}{1} + \binom{3}{2}\binom{5}{2}$
5     $\binom{3}{3}\binom{5}{2} + \binom{3}{2}\binom{5}{3} + \binom{3}{1}\binom{5}{4}$
6     $\binom{3}{3}\binom{5}{3} + \binom{3}{2}\binom{5}{4} + \binom{3}{1}\binom{5}{5}$
7     $\binom{3}{3}\binom{5}{4} + \binom{3}{2}\binom{5}{5}$
8     $\binom{3}{3}\binom{5}{5}$
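A brute-force check of these counts (our own sketch; it assumes the error condition of eq. (6), with ties counted as errors):

```python
from itertools import combinations

def q(c, i):
    """Number of weight-i error patterns for which eq. (6) holds, i.e. the flipped
    positions carry at least half of the total weight sum(c)."""
    n, total = len(c), sum(c)
    return sum(1 for flips in combinations(range(n), i)
               if 2 * sum(c[j] for j in flips) >= total)

c = (1, 2, 1, 2, 1, 2, 1, 1)
print([q(c, i) for i in range(len(c) + 1)])   # [0, 0, 0, 1, 35, 55, 28, 8, 1]
```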
Note that the union bound (7) on the decoder word error probability $P_E^{IT,BSC}$ has the same form as the usual word error probability union bound on the BSC for ordinary codewords, for which $c_j \in \{0, 1\}$ and $q(c, i) \le \binom{n}{i}$. For pseudocodewords, $c_j$ is a measure of the weight of bit $j$, in the sense that an evil channel would choose to flip the bits $j$ with large $c_j$. Note also that the min-sum iterative decoding union bound (3) for the AWGN channel can be derived from eq. (5) with the AWGN channel conditional densities
$$P(y \mid 0) = \frac{1}{\sqrt{\pi N_0}}\, e^{-(y-1)^2/N_0}, \qquad P(y \mid 1) = \frac{1}{\sqrt{\pi N_0}}\, e^{-(y+1)^2/N_0}.$$
9 The (8, 4, 4) Hamming Code on the BSC
For the same (8, 4, 4) Hamming code tail-biting trellis as in Section 6, there are 64 two-segment pseudocodewords, and the $c$'s are of one of the following three forms, or a permutation of them: $c_1 = (12121211)$, $c_2 = (01011210)$, and $c_3 = (01211212)$. The corresponding $q(c_j, i)$'s are given in the table below.
i          :  0   1   2   3   4   5   6   7   8
q(c_1, i)  :  0   0   0   1  35  55  28   8   1
q(c_2, i)  :  0   0   5  10   5  56  28   8   1
q(c_3, i)  :  0   0   0  13  34  21  28   8   1
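Given the table, the right-hand side of eq. (7) is easy to evaluate per pseudocodeword type. A minimal sketch (the q rows are copied from the table as printed, and p = 0.01 is an arbitrary sample crossover probability):

```python
p, n = 0.01, 8
q_rows = {                                   # rows of the table above
    "c1": [0, 0, 0, 1, 35, 55, 28, 8, 1],
    "c2": [0, 0, 5, 10, 5, 56, 28, 8, 1],
    "c3": [0, 0, 0, 13, 34, 21, 28, 8, 1],
}
for name, q in q_rows.items():
    term = sum(q[i] * p ** i * (1 - p) ** (n - i) for i in range(n + 1))
    print(name, term)                        # per-type contribution to the eq. (7) bound
```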
[Figure 4 here: BER versus $E_b/N_0$ (dB). Curves: union bound without pseudocodewords, union bound with pseudocodewords, ML decoding, iterative decoding (5 iterations).]
Figure 4: (8,4,4) tail-biting Hamming code union bound on BSC

In Figure 4 we have plotted the bounds with and without pseudocodewords obtained from eq. (7), using the data from the table. Without pseudocodewords, the word error probability bound is $14\,Q_4 + Q_8$, where
$$Q_4 = \binom{4}{2} p^2 (1-p)^2 + \binom{4}{3} p^3 (1-p) + p^4, \qquad Q_8 = \binom{8}{4} p^4 (1-p)^4 + \binom{8}{5} p^5 (1-p)^3 + \binom{8}{6} p^6 (1-p)^2 + \binom{8}{7} p^7 (1-p) + p^8.$$
Because the leading term in both bounds is the same, these bounds are asymptotically equal.
10 Pseudoweights on the BSC
Define $\hat{\imath}$ as the minimum, over all (simple pseudocodewords) $c$, of the smallest $i$ such that $q(c, i) \ne 0$, and define the "minimum BSC pseudoweight" as $2\hat{\imath}$. If the minimum BSC pseudoweight is at least as large as the minimum Hamming weight of the code, then the asymptotic performance of iterative min-sum decoding is not degraded. We have the following theorem for the BSC pseudoweight of 2-segment pseudocodewords.

Theorem 2 The BSC pseudoweight of any nonzero 2-segment pseudocodeword must be at least the minimum weight if the two segments are not identical.
Proof: Without loss of generality, rearrange the coordinates of the pseudocodeword fragments $x_1$ and $x_2$ so that
$$x_1 = 1 \cdots 1\ \ 0 \cdots 0\ \ 1 \cdots 1\ \ 0 \cdots 0$$
$$x_2 = 1 \cdots 1\ \ 1 \cdots 1\ \ 0 \cdots 0\ \ 0 \cdots 0$$
$$x_1 \oplus x_2 = \underbrace{0 \cdots 0}_{n_1}\ \underbrace{1 \cdots 1\ \ 1 \cdots 1}_{n_2}\ 0 \cdots 0$$
$$c = \underbrace{2 \cdots 2}_{n_1}\ \underbrace{1 \cdots 1\ \ 1 \cdots 1}_{n_2}\ 0 \cdots 0,$$
where $c$ is the column sum (ordinary sum) of the pseudocodeword fragments. Let $d_{\min}$ denote the ordinary minimum distance, and let $\hat{\imath}$ be as defined at the beginning of the section. Since $x_1 \oplus x_2$ is an ordinary codeword, $n_2 \ge d_{\min}$. Hence it suffices to show $\hat{\imath} \ge \lceil n_2/2 \rceil$. There are two cases to consider.

Case 1: $n_1 \ge \lceil n_2/2 \rceil$. Each flipped position contributes weight at most 2, and by eq. (6) the flipped positions must carry at least half of the total weight $2n_1 + n_2$. Hence
$$2\hat{\imath} \ \ge\ \tfrac{1}{2}(2 n_1 + n_2) \ =\ n_1 + \tfrac{n_2}{2} \ \ge\ n_2,$$
so $\hat{\imath} \ge n_2/2$ and, since $\hat{\imath}$ is an integer, $\hat{\imath} \ge \lceil n_2/2 \rceil$.

Case 2: $n_1 < \lceil n_2/2 \rceil$. Here it is best to flip all $n_1$ weight-2 positions; let $k$ be the integer such that $k + n_1 = \hat{\imath}$. Then
$$2 n_1 + k \ \ge\ \tfrac{1}{2}(2 n_1 + n_2), \quad\text{so}\quad n_1 + k \ \ge\ \tfrac{n_2}{2},$$
i.e., $\hat{\imath} \ge n_2/2$; since $\hat{\imath}$ is an integer, $\hat{\imath} \ge \lceil n_2/2 \rceil$. $\Box$
Again, the pseudoweight of a 3 (or more)-segment pseudocodeword can be less than the minimum distance. In particular, the two tail-biting trellises for the Golay code in Section 7 each have a pseudocodeword of BSC pseudoweight 6, which is less than the minimum distance 8 of the Golay code.
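Since it is always best to flip the positions with the largest column sums first, $\hat{\imath}$ (and hence the BSC pseudoweight $2\hat{\imath}$ of a given pseudocodeword) can be computed greedily from $c$; a small sketch of our own:

```python
def bsc_pseudoweight(c):
    """Return 2 * i_hat, where i_hat is the fewest flips satisfying eq. (6): the
    flipped positions must carry at least half of the total weight of c."""
    total, flipped_weight, flips = sum(c), 0, 0
    for w in sorted(c, reverse=True):        # flip the heaviest positions first
        if 2 * flipped_weight >= total:
            break
        flipped_weight += w
        flips += 1
    return 2 * flips

print(bsc_pseudoweight((1, 2, 1, 2, 1, 2, 1, 1)))   # 6: three weight-2 flips suffice
print(bsc_pseudoweight((0, 1, 0, 1, 1, 2, 1, 0)))   # 4: flip the 2 and one 1
```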
11 Conclusions

The results in this paper strongly suggest that the excellent experimental performance reported for iterative min-sum decoding of tail-biting codes is due to the fact that the minimum pseudoweight of most tail-biting codes is strictly larger than the ordinary minimum weight. It remains a challenging problem to produce a practical algorithm for computing the minimum pseudoweight. In any case, because of the union bound, we can say with confidence that if, for a given code, the minimum pseudoweight is indeed greater than the minimum weight, then the performance of iterative min-sum decoding will be asymptotically the same as that of ML decoding.
References

[1] S. M. Aji, Graphical Models and Iterative Decoding. Ph.D. thesis, Caltech, 1998.
[2] S. M. Aji, G. B. Horn, R. J. McEliece, and M. Xu, "Iterative min-sum decoding of tail-biting codes," Proc. IEEE Information Theory Workshop, Killarney, Ireland, June 1998, pp. 68-69.
[3] S. M. Aji, G. B. Horn, and R. J. McEliece, "On the convergence of iterative decoding on graphs with a single cycle," Proc. CISS 1998 (Princeton, NJ, March 1998).
[4] S. M. Aji and R. J. McEliece, "The generalized distributive law," preliminary versions presented at ISIT 97 and ISCTA 97. Current version available at www.systems.caltech.edu/EE/Faculty/rjm/; submitted to IEEE Trans. Inform. Theory.
[5] J. B. Anderson and S. M. Hladik, "Tail-biting MAP decoders," IEEE J. Select. Areas Comm., vol. 16, no. 2, Feb. 1998.
[6] J. B. Anderson, personal communication.
[7] C. Berrou, A. Glavieux, and P. Thitimajshima, "Near Shannon limit error-correcting coding and decoding: Turbo-codes," Proc. IEEE Int. Comm. Conf. (Geneva, Switzerland, 1993), pp. 1064-1070.
[8] A. R. Calderbank, G. D. Forney, Jr., and A. Vardy, "Minimal tail-biting trellises: the Golay code and more," submitted to IEEE Trans. Inform. Theory.
[9] G. D. Forney, Jr., F. R. Kschischang, and B. Marcus, "Iterative decoding of tail-biting trellises," presented at the 1998 Information Theory Workshop, San Diego, Feb. 9-11, 1998, pp. 11-12.
[10] G. D. Forney, Jr., F. R. Kschischang, and A. Reznik, "The effective weight of pseudocodewords for codes defined on graphs with cycles on AWGN channels," preprint.
[11] D. Lind and B. Marcus, Symbolic Dynamics and Coding. Cambridge, England: Cambridge University Press, 1995.
[12] R. Koetter and A. Vardy, personal communication.
[13] G. Solomon and H. C. A. van Tilborg, "A connection between block and convolutional codes," SIAM J. Appl. Math., pp. 358-367, October 1979.
[14] Y. Weiss, "Correctness of local probability propagation in graphical models with loops," submitted to Neural Computation.
[15] N. Wiberg, Codes and Decoding on General Graphs. Ph.D. dissertation, Department of Electrical Engineering, Linköping University, Sweden, April 1996.