APPLICATIONS OF NONLINEAR DYNAMICS TO THE TURBO DECODING ALGORITHM L. Kocarev† , Z. Tasev† and G. M. Maggio‡ ‡
†
Institute for Nonlinear Science University of California, San Diego 9500 Gilman Dr., La Jolla, CA 92093-0402 E-mail:
[email protected],
[email protected] ABSTRACT In this paper, we treat the turbo decoding algorithm as a dynamical system parameterized by the SNR (signal-to-noise ratio). A whole range of nonlinear phenomena, including chaos and transient chaos, have been shown to occur in the turbo decoding algorithm. As an application, we develop an adaptive technique to control transient chaos, leading to an improvement of the BER (bit error rate) performance. 1. INTRODUCTION In 1948, Shannon formulated the famous noisy channel coding theorem [1], which establishes the fundamental limits for digital communication in terms of channel capacity. Recently, it has been recognized that two classes of codes, namely turbo codes [2] and low-density parity-check (LDPC) codes [3, 4] perform at rates extremely close to the Shannon limit. Turbo codes were discovered by Berrou et al. in 1993 [2]. On the other hand, LDPC codes were originally introduced by Gallager [3] in 1962. The crucial innovation of LDPC codes is the introduction of iterative decoding algorithms. LDPC codes were rediscovered by MacKay et al. [4] in 1996. So far, most studies about turbo and LDPC codes have relied upon methods of information theory. However, iterative decoding algorithms can be viewed as complex nonlinear dynamical systems. Very recently, in a pioneering paper [5], Richardson has presented a geometrical interpretation of the turbo decoding algorithm and formalized it as a discrete-time dynamical system defined on a continuous set. This approach clearly demonstrates the relationship between turbo decoding and maximum-likelihood decoding. The turbo decoding algorithm appears as an iterative algorithm aimed at solving a system of 2n equations with 2n unknowns, where n is the block-length size. If the turbo decoding algorithm converges to a certain codeword, then This work was supported in part by STMicroelectronics Inc., University of California (DiMi program), and the ARO (Army Research Office).
STMicroelectronics, Inc. & Center for Wireless Communications University of California, San Diego 9500 Gilman Dr., La Jolla, CA 92093-0407 E-mail:
[email protected]
the latter constitutes a solution to this set of equations. Conversely, solutions to these equations provide fixed points of the turbo decoding algorithm, seen as a nonlinear mapping. In a follow-up of this work [6], Agrawal and Vardy parameterized the turbo decoding algorithm, as a dynamical system, in terms of the SNR. Also, in [7] it has been shown that a whole range of phenomena, known to occur in nonlinear systems, such as multiple fixed points, oscillatory behavior, bifurcations, chaos and transient chaos, can actually be observed in the turbo decoding algorithm. In this work, we consider an application of the theory developed in [7]. Namely, we use a technique for controlling transient chaos, thereby reducing the number of iterations needed by the turbo decoding algorithm to converge. 2. PRELIMINARIES ON TURBO CODES Turbo codes consist of a parallel concatenation of two recursive systematic binary convolutional codes, CC1 and CC2, separated by a random interleaver [2]. Let i be the information bit sequence of length n at the input to the turbo encoder, and let p1 (i) and p2 (i) be the parity bits produced by the first and second encoder, respectively. The input sequence of the second encoder has the same Hamming weight as i, but the bits are rearranged due to a permutation defined by the choice of the random interleaver, π. The information bit sequence, i, along with the parity bit sequences p1 (i) and p2 (i), forms a turbo codeword (i, p1 (i), p2 (i)) of length 3n. We assume that the turbo code is transmitted over an AWGN (additive white Gaussian noise) binary-input memoryless channel, using the BPSK (binary phase shift keying) modulation. We denote the transmitted turbo codeword by (s0 , s1 , s2 ) ≡ (i, p1 (i), p2 (i)). Let c0 , c1 and c2 be the channel outputs corresponding to the input sequences s0 , s1 and s2 , respectively. The turbo decoder consists of two components: a decoder SISO1 for the convolutional code CC1, and a decoder SISO2 for the
completely characterized by the channel likelihood ratios. Consequently, the mapping (1)-(2) depends on 3n parameters. After l iterations, a hard decision on the k-th bit can be made according to the sign of the log-likelihood ratio:
0
10
2−nd iteration 4−th iteration 8−th iteration 16−th iteration 32−nd iteration
−1
10
Lk (l) = log
−2
BER
10
p1i (l) 4 = X1k (l) + X2k (l) + 2 c0k p0i (l) σ
(3)
where σ 2 represents the noise variance. −3
10
3.1. A posteriori average entropy −4
10
−5
10 −0.5
0
0.5
1
1.5
2
SNR [dB]
Figure 1: BER performance of the classical turbo code with interleaver length 1024 and rate 1/3, for increasing number of iterations. The waterfall region of the turbo code corresponds approximately to the SNR region 0.25—1.25 dB. code CC2.These decoders iteratively compute the extrinsic information, Xi , of the information bits. The input for the decoder SISO1 (resp. SISO2) is the extrinsic information X2 (resp. X1 ) from the decoder SISO2 (resp. SISO1), which is used as a priori information after interleaving (resp. deinterleaving). In our analysis we consider the classical turbo code [2] with identical constituent recursive convolutional codes generated by the polynomials {D 4 + D3 + D2 + D1 + 1, D4 + 1}, producing a rate-1/3 turbo code. The length of the interleaver is n = 1024. Figure 1 shows the BER (biterror-rate) performance of the turbo code versus the SNR, expressed in dB. 3. TURBO DECODING AS NONLINEAR SYSTEM The turbo decoding algorithm may be described by a discrete-time dynamical system of the form: X1 (l + 1) = F1 [X2 (l); c0 , c1 ] X2 (l) = F2 [X1 (l); c0 , c2 ]
(1) (2)
where Fi = (F11 , . . . , F1n ) are nonlinear functions which depend on the constituent codes. Equations (1)-(2) define an n-dimensional mapping, because there are only n independent variables, namely X1 (or alternatively X2 ). Assuming that a uniform probability distribution of the a priori information is induced, the initial condition of the algorithm should be set at the origin. The quantities c0 , c1 and c2 are
Since the turbo decoding algorithm is a high dimensional dynamical system, we suggest the following representation of its trajectories in the state space. At each iteration l, the turbo decoding algorithm computes 2n values of the extrinsic information, Xi . From Eq. (3) one can compute, for each iteration, the probabilities p0k (l) and p1k (l) that the k-th bit is either 0 or 1. Let us define 1X 0 p (l) log2 p0i (l) + p1i (l) log2 p1i (l). n i=1 i n
E(l) = −
Then, E represents the a posteriori average entropy, which is a measure of the reliability of bit decisions, for an information block of size n. The advantage being that E ∈ [0, 1]. Namely, when all bits are detected correctly, pi tend to 0 or 1 for all i and, therefore, E → 0. On the other hand, when all bits are equally probable, i.e. ambiguous, E = 1. 3.2. Parameterization by the SNR As a complex dynamical system with a large number of parameters, the turbo decoding algorithm is not readily amenable for analysis. Namely, the mapping is parameterized by the noise values c01 , c02 , . . . , c0n , c11 , c12 , . . . , c1n , and c21 , c22 , . . . , c2n . To analyze the trajectories of the turbo decoding algorithm we parameterize it by the SNR 1/σ 2 [6]. For large n, typical for turbo codes, noise variance σ 2 is P 0 the 2 2 approximately equal to σ ˜ = [(ck ) + (c1k )2 + (c2k )2 ]/3n. We fix (3n − 1) noise ratios c01 /c02 , c02 /c03 , . . . , c2n−1 /c2n and treat the turbo decoding algorithm as it was depending on a single parameter 1/˜ σ 2 , which is closely related to the SNR. 3.3. Fixed Points The turbo decoding algorithm admits two types of fixed points: • Unequivocal fixed point: In this case, p0k is close to 0 or 1 for all k, so the log-likelihoods Lk (and Xik ) assume large values, and consequently E = 0. Hard decisions corresponding to unequivocal fixed point will typically form a valid codeword, usually coinciding with the transmitted codeword.
• Indecisive fixed point: This corresponds to Lk → 0, i.e., equal probability that the k-th bit is 0 or 1, hence E → 1. At this fixed point the turbo decoding algorithm is relatively ambiguous regarding the values of the information bits. Hard decisions corresponding to this fixed point typically will not form a valid codeword.
c
SISO1
g
c
3.4. Chaotic behavior The turbo decoding algorithm exhibits chaotic behavior, characterized by positive Lyapunov exponents [8], for a relatively large range of SNR values, from -6dB to 0dB [7]. There are several ways (routes) by which chaotic attractors may arise. These include infinite period-doubling cascades, intermittency, crises and torus breakdown [8]. In the waterfall region, see Fig. 1, the turbo decoding algorithm converges either to the chaotic invariant set or to the unequivocal fixed point, after a long transient behavior indicating the existence of a chaotic non-attracting invariant set in the vicinity of the unequivocal fixed point. In some cases, the algorithm spends a few thousand iterations before reaching the fixed point solution. As SNR increases, the number of iterations decreases. Table 1 reports the error frame statistics as a function of the number of iterations, for different values of SNR. We emphasize that the transient chaos behavior is characteristic of the waterfall region of the turbo code. In what follows, we will see how this fact can be exploited to enhance the performance of turbo codes. 4. APPLICATION: ULTRA-FAST CONVERGENCE In this section we consider an application of nonlinear control theory in order to speed-up the convergence of the turbo decoding algorithm. Iterations/SNR it ≤ 5 5 < it ≤ 10 10 < it ≤ 20 20 < it ≤ 50 50 < it ≤ 100 100 < it ≤ 200 200 < it ≤ 500 500 < it ≤ 1000 1000 < it ≤ 2000 2000 < it
0.2 6 10.1 4.9 5.1 4.9 3 4 2.9 3.9 55.2
0.4 19.3 22.6 9.3 7.5 4.7 4 4.3 2.9 2.5 22.9
0.6 49.2 24.3 8.4 4.5 2.1 1.9 1.8 1 1 5.8
0.8 77.2 15.2 3.5 1.2 .5 .4 .5 .3 .2 1
1.0 92.7 5.6 .8 .5 .1 1 .1 0 0 .1
Table 1: Table characterizing the percentage (%) distribution of the frames with errors, as a function of the number of iterations, for different values of SNR.
g
SISO2
c
Figure 2: Block diagram of the turbo decoder including the control of transient chaos: SISO1 and SISO2 are softinput/soft-output decoders, Xi are extrinsic information; cj is the channel output corresponding to the input sj ; g is the control function (g is the identity for classical turbo codes). Namely, we have developed a simple adaptive control mechanism to reduce a long transient behavior in the algorithm. A schematic block diagram of the turbo decoder including the adaptive control is reported in Fig. 2. Our control algorithm reads: gi (Xi ) = αe−β|Xi | Xi , where α and β are parameters. In our simulation, we selected α = 0.9 and β = 0.01. Recall that gi are n-dimensional vectors: gi = (gi1 , . . . , gin ). The intuition behind the control strategy can be explained as follows. The probability p0k (l) that k-th bit is 0 is given by: p0k (l) =
1 , 1 + e Lk
where: Lk = X1k + X2k + σ42 c0k . Assume now that, for example, Lk is a small number. Then, correspondingly, gik is close to 1 (α is close to 1 and β is small). In other words, the control algorithm does nothing. If however, Lk is large, then the control algorithm reduces the values of gik . If the k-th bit is part of a valid codeword, the control algorithm does not affect the turbo decoding algorithm: the turbo decoding algorithm makes a decision for the k-th bit, in the above example, with a probability which is close to 1, with or without control. However, if the k-th bit is not a part of a valid codeword, and turbo decoding algorithm “struggles” to find the valid codeword, the attenuation effect of gik helps a great deal in reducing the long transient
(a)
0
10
100
8−th iteration, without control 32−nd iteration, without control 8−th iteration, with control 32−nd iteration, with control
% of chaotic frames
SNR = 0.4 dB
−1
10
−2
10
60 40 20
BER
0
5
10
20
50 100 200 500 1000 2000 No. of iterations
−3
(b)
10
25 % of chaotic frames
SNR = 0.8 dB −4
10
−0.25
without control with control
80
0
0.25
0.5 0.75 SNR [dB]
1
1.25
20
without control with control
15 10 5
1.5
Figure 3: The performance of the classical turbo code (interleaver length 1024) with and without control of the transient chaos, for 8 and 32 iterations.
behavior in the turbo decoding algorithm. For instance, at SNR=0.8 dB, we found that the average chaotic transient lifetime, with control, reduces from 378 to only 9 iterations. The results of the above control strategy are summarized in Fig. 3. On average, the turbo decoding algorithm with control exhibits a gain of 0.25–0.3 dB versus the classical turbo decoding algorithm (without control). We can observe from Fig. 3 that the control is not very effective in the error-floor region, where the iterative decoding process does not exhibit phenomena of transient chaos. The error frame statistics with/without control, as a function of the SNR, are reported in Fig. 4. Finally, we stress that our adaptive control algorithm is very simple, and can be easily implemented (both in software and hardware) without significantly increasing the complexity or slowing down the decoding algorithm.
5. CONCLUSIONS The turbo decoding algorithm can be viewed as a highdimensional dynamical system parameterized by a large number of parameters. The turbo decoding algorithm has been shown to exhibit a whole range of phenomena known to occur in nonlinear systems. These include the existence of multiple fixed points, oscillatory behavior, bifurcations, chaos and transient chaos. In this work, as an application of the theory developed, we apply a simple technique for controlling transient chaos, thus reducing the number of iterations needed by the turbo decoding algorithm to converge.
0
5
10
20
50 100 200 500 1000 2000 No. of iterations
Figure 4: Frames remaining chaotic after a certain number of iterations, with/without control of transient chaos: (a) SNR=0.4 dB, (b) SNR=0.8 dB. 6. REFERENCES [1] C. E. Shannon, “A mathematical theory of communication,” Bell Systems Technical Journal, vol. 27, pp. 379– 423, pp. 623-56, (1948). [2] C. Berrou, A Glavieux, and P. Thitimajshima, “Near Shannon limit error-correcting coding and decoding: Turbo-Codes,” Proc. IEEE International Communications Conference, pp. 1064–70, (1993). [3] R.G. Gallager, “Low Density Parity Check Codes,” IRE Trans. Inf. Theory, vol. 8, pp. 21–28, (1962). [4] D.J.C. MacKay and R.M. Neal, “Near Shannon limit performance of Low Density Parity Check Codes,” Electronics Letters, vol. 32, pp. 1645–46, (1996). [5] T. Richardson, “The geometry of turbo-decoding dynamics,” IEEE Trans. Information Theory, vol. 46, pp. 9-23 (2000). [6] D. Agrawal and A. Vardy, “The turbo decoding algorithm and its phase trajectories,” IEEE Trans. Information Theory, vol. 47, No. 2, pp. 699–722 (2001). [7] Z. Tasev, L. Kocarevy, F. Lehmann and G.M. Maggio, “Nonlinear phenomena in the turbo decoding algorithm,” Proc. NOLTA 2002, Xian, PRC, October 711, 2002. [8] E. Ott, Chaos in dynamical systems, Cambridge University Press, New York, (1993).