ISIT 2006, Seattle, USA, July 9-14, 2006
Block Turbo Codes: From Architecture to Application

B. Geller∗‡, I. Diatta∗, J. P. Barbot‡, C. Vanstraceele‡ and F. Rambeau‡

‡ SATIE, ENS Cachan, 94235 Cachan, France. Email: geller,barbot,vanstraceele, [email protected]
∗ IUT-GTR, Université Paris 12, 94010 Créteil, France. Email: [email protected]

Abstract: This paper is concerned with the design of block turbo codes for very high bit rate applications. We first introduce an original low-complexity architecture designed for the iterative decoding of product codes. The implementation of the Chase-Pyndiah algorithm is simplified by not memorizing any of the competing code words produced by the list decoding. We then illustrate that such block turbo codes allow some bit rate improvement in the context of local loop transmission.

I. INTRODUCTION

Convolutional turbo codes allow performance close to Shannon's theoretical limit. Block Turbo Codes (BTCs), i.e. product block codes with iterative column and row decoding, also achieve near-capacity performance at high coding rates [1]. Pyndiah [2] devised a low-complexity block turbo decoding algorithm which has two appealing features for high bit rate applications: first, it inherits the low complexity of algebraic decoders; furthermore, the turbo process converges in only four iterations. Each iteration m of this algorithm can be summarized as follows:

• For each row (or column) of an n × n product word, perform a Chase search [3] in order to obtain a list of competing code words and a decided code word $D = (d_0, \ldots, d_j, \ldots, d_{n-1})$ with $d_j \in \{-1, +1\}$.

• Evaluate a vector of decision reliabilities $\Lambda = (\Lambda(d_0), \ldots, \Lambda(d_j), \ldots, \Lambda(d_{n-1}))$ in terms of normalized log-likelihood ratios (LLRs); for each position j (j < n), the reliability of the decision is:

$$\Lambda(d_j) \approx \frac{\|R - C^{-1(j)}\|^2 - \|R - C^{+1(j)}\|^2}{4} \qquad (1)$$

where $C^{-1(j)}$ (respectively $C^{+1(j)}$) is the code word of the Chase list at minimum Euclidean distance from the updated received word $R = (r_0, \ldots, r_j, \ldots, r_{n-1})$ such that $c_j^{-1(j)} = -1$ (respectively $c_j^{+1(j)} = +1$). One of these two code words is equal to D and the other is then referred to as $C^{D}$.

• Obtain for each position j the extrinsic information $w_j$:

$$w_j = \begin{cases} \Lambda(d_j) - r_j & \text{if a competing word } C^{D} \text{ exists,} \\ \beta(m)\, d_j & \text{if no competing word exists.} \end{cases} \qquad (2)$$

• Once the extrinsic information has been evaluated, update the input to the next Soft-Input/Soft-Output (SISO)
decoding stage with:

$$r_j = r_j + \alpha(m)\, w_j \qquad (3)$$
where α(m) is a weight factor that increases along the convergence process.

Pyndiah's original work has since been refined in several ways. For instance, [4] (respectively [5]) proposes to update the algorithm without using any of the coefficients α(m) in equation (3) (respectively β(m) in equation (2)). [6] considers parallel decoding of the rows and columns of the product code in order to halve the latency of the decoder. [7] presents a low-complexity Chase decoder.

In Section II, we show that it is unnecessary to store any of the Chase competing words in memory and that, at each iteration, the output of the SISO decoder can readily be obtained. In Section III, we illustrate that such a BTC provides appreciable bit rate and range improvements in the context of Very High bit rate Digital Subscriber Line (VDSL) systems.

II. SIMPLIFICATION OF THE DECODER'S ARCHITECTURE

A. Principle

We consider the decoding process of a given updated received (row or column) vector R of length n. Let $P(C) = \sum_{j=0}^{n-1} r_j c_j$ be the ordinary scalar product of R with a code word C. A code word is at minimum distance from R if, and only if, its scalar product is maximum. Equation (1) can then be written as the difference between two scalar products:

$$\Lambda(d_j) = 0.5\,\big(P(C^{+1(j)}) - P(C^{-1(j)})\big) \qquad (4)$$
This illustrates that, instead of memorizing code words to evaluate the reliabilities, it is sufficient to memorize scalar products [8], [9]. One then observes that a memorized scalar product can be used several times during the updating of a vector of reliabilities if the corresponding code word is at minimum Euclidean distance for several different coordinates. More precisely, a given scalar product is used every time the corresponding code word is one of the (possibly two) minimum distance competing code words. For instance, for every coordinate, the scalar product P(D) of the decided code word D is necessarily one of the two competing scalar products of equation (4), as the decided word D is at minimum Euclidean distance from R.
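The step from equation (1) to equation (4) relies on the code words having ±1 entries: then $\|R - C\|^2 = \|R\|^2 + n - 2P(C)$, so the terms $\|R\|^2 + n$ cancel in the difference of equation (1). The following minimal Python sketch checks this numerically; the received vector and the two competing words are hypothetical values chosen only for illustration, and the function names are not taken from the paper.

```python
import math


def scalar_product(r, c):
    """P(C): ordinary scalar product of the received vector with a +/-1 word."""
    return sum(r_j * c_j for r_j, c_j in zip(r, c))


def squared_distance(r, c):
    """Squared Euclidean distance between the received vector and a word."""
    return sum((r_j - c_j) ** 2 for r_j, c_j in zip(r, c))


def llr_from_distances(r, c_minus, c_plus):
    """Reliability written as in equation (1)."""
    return (squared_distance(r, c_minus) - squared_distance(r, c_plus)) / 4.0


def llr_from_products(r, c_minus, c_plus):
    """Reliability written as in equation (4)."""
    return 0.5 * (scalar_product(r, c_plus) - scalar_product(r, c_minus))


# Hypothetical data: c_plus has +1 and c_minus has -1 at position 0.
r = [0.8, -1.1, 0.3, 1.4, -0.2, 0.9, 1.0]
c_plus = [1, -1, 1, 1, -1, 1, 1]
c_minus = [-1, -1, -1, 1, 1, 1, 1]

via_distances = llr_from_distances(r, c_minus, c_plus)
via_products = llr_from_products(r, c_minus, c_plus)
print(via_distances, via_products)
```

Both expressions give the same value up to floating-point rounding, which is why only the scalar products need to be kept.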
We thus adopt the following strategy, which requires only two shift registers of n real values: $SP^+ = (SP_0^+, \ldots, SP_j^+, \ldots, SP_{n-1}^+)$ designates a shift register containing the different scalar products $P(C^{+1(j)})$; $SP^- = (SP_0^-, \ldots, SP_j^-, \ldots, SP_{n-1}^-)$ designates a shift register containing the different scalar products $P(C^{-1(j)})$.

1) Initialization: $SP^+$ and $SP^-$ are both initialized to the minimum possible value $-\infty$.

2) First Step: Let $C^1 = (c_{1,0}, \ldots, c_{1,j}, \ldots, c_{1,n-1})$ be the first Chase decoded code word and $P(C^1)$ its scalar product. Let $J_1^+ = \{j \in \mathbb{N},\ j < n \mid c_{1,j} = +1\}$ and $J_1^- = \{j \in \mathbb{N},\ j < n \mid c_{1,j} = -1\}$. For $j \in J_1^+$, $SP_j^+ = P(C^1)$. For $j \in J_1^-$, $SP_j^- = P(C^1)$.

3) Following Steps: For each different Chase decoded code word $C^k = (c_{k,0}, \ldots, c_{k,j}, \ldots, c_{k,n-1})$ with scalar product $P(C^k)$, let $J_k^+ = \{j \in \mathbb{N},\ j < n \mid c_{k,j} = +1\}$ and $J_k^- = \{j \in \mathbb{N},\ j < n \mid c_{k,j} = -1\}$. For $j \in J_k^+$, if $P(C^k) > SP_j^+$ then $SP_j^+ = P(C^k)$. For $j \in J_k^-$, if $P(C^k) > SP_j^-$ then $SP_j^- = P(C^k)$.

4) Result:

$$\Lambda = 0.5\,(SP^+ - SP^-) = (\Lambda(d_0), \ldots, \Lambda(d_j), \ldots, \Lambda(d_{n-1}))$$

(The indexes where there is no competing code word $C^{D}$ are simply β positions; see equation (2) or reference [4].)

B. Example

Let us illustrate the previous procedure with a simple example. Let $C^i$ (i ∈ {1, 2, 3}) be the three code words generated by a Chase search, with their scalar products $P(C^i)$ evaluated by the decoder.

1) First Step:
C^1 = (−1, 1, −1, 1, 1, −1, 1), P(C^1) = 1.4
SP+ = (−∞, 1.4, −∞, 1.4, 1.4, −∞, 1.4)
SP− = (1.4, −∞, 1.4, −∞, −∞, 1.4, −∞)

2) Second Step:
C^2 = (1, −1, 1, −1, 1, 1, −1), P(C^2) = 3.6
SP+ = (3.6, 1.4, 3.6, 1.4, 3.6, 3.6, 1.4)
SP− = (1.4, 3.6, 1.4, 3.6, −∞, 1.4, 3.6)

3) Third Step:
C^3 = (1, 1, 1, 1, −1, 1, 1), P(C^3) = 5.4
SP+ = (5.4, 5.4, 5.4, 5.4, 3.6, 5.4, 5.4)
SP− = (1.4, 3.6, 1.4, 3.6, 5.4, 1.4, 3.6)

4) Results:
Λ = 0.5(SP+ − SP−) = (+2.0, +0.9, +2.0, +0.9, −0.9, +2.0, +0.9)

We notice that the code word with the maximum scalar product, $C^3 = D$, is such that:

$$C^3 = (\mathrm{sign}[\Lambda(d_0)], \ldots, \mathrm{sign}[\Lambda(d_j)], \ldots, \mathrm{sign}[\Lambda(d_{n-1})])$$

C. Further Discussion

The previous procedure simplifies the whole architecture of a BTC decoder. Not only is there no need to store the competing code words in memory but, with the simplified evaluation of the reliabilities, many of the existing functionalities of a BTC decoder [10] also disappear, for instance the search, for every bit, of a relevant competing code word. It is even unnecessary to identify which of the competing code words is the maximum likelihood vector D.

One can take advantage of the reduction in architecture complexity to use more test patterns in the Chase search. Figure 1 illustrates, in the case of a BCH(128,113)² product code, the Bit Error Rate (BER) improvement at the fourth iteration when the number of least reliable bits used for the generation of the test patterns is increased from p = 4 to p = 6. A gain of more than 0.3 dB is then observed at a BER of $10^{-3}$. One can measure [9] that this code, of coding rate R = 0.78, is at 1.3 dB from the Shannon capacity in only 4 iterations.

The proposed procedure is particularly light when used in conjunction with a Fast Chase algorithm [7]; this is because if two code words $C$ and $C'$ differ in just one position k ($c'_k = -c_k$), their scalar products are linked by the equality $P(C') = P(C) - 2\, r_k c_k$.

Fig. 1. Bit Error Rates at iteration m = 4 for different numbers of tested bits using QPSK signalling on an AWGN channel
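The two-register procedure of Section II-A can be sketched in a few lines of Python and run on the three code words of the example in Section II-B. This is only an illustrative sketch, not the authors' hardware architecture: the function names are hypothetical, positions without a competing word are returned as `None` (they would receive the β(m) treatment of equation (2)), and `flipped_product` uses the relation that follows directly from the definition of P(C) when a single ±1 entry is flipped.

```python
import math


def update_registers(sp_plus, sp_minus, codeword, product):
    """Steps 2) and 3): for each position, keep the largest scalar product
    seen so far among words with +1 (resp. -1) at that position."""
    for j, c_j in enumerate(codeword):
        register = sp_plus if c_j == +1 else sp_minus
        if product > register[j]:
            register[j] = product


def reliabilities(chase_list, n):
    """Run the procedure over (codeword, scalar product) pairs.

    Returns both registers and the reliabilities 0.5 * (SP+ - SP-);
    a position with no competing word yields None.
    """
    sp_plus = [-math.inf] * n   # step 1): initialization
    sp_minus = [-math.inf] * n
    for codeword, product in chase_list:
        update_registers(sp_plus, sp_minus, codeword, product)
    lam = [
        0.5 * (p - m) if math.isfinite(p) and math.isfinite(m) else None
        for p, m in zip(sp_plus, sp_minus)
    ]
    return sp_plus, sp_minus, lam


def flipped_product(product, r, codeword, k):
    """Scalar product of the word obtained by flipping position k:
    the term r_k * c_k becomes -r_k * c_k, a change of -2 * r_k * c_k."""
    return product - 2.0 * r[k] * codeword[k]


# The three Chase code words of the example, with their scalar products.
chase_list = [
    ([-1, 1, -1, 1, 1, -1, 1], 1.4),
    ([1, -1, 1, -1, 1, 1, -1], 3.6),
    ([1, 1, 1, 1, -1, 1, 1], 5.4),
]
sp_plus, sp_minus, lam = reliabilities(chase_list, n=7)
decided = [1 if value > 0 else -1 for value in lam]
print(sp_plus, sp_minus, lam, decided)
```

On this example the sign pattern of the reliabilities coincides with the code word of largest scalar product, so the decided word never has to be identified or stored separately.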

III. BLOCK TURBO CODES FOR THE VERY HIGH BIT RATE LOCAL LOOP

A. Generalities on Very High Bit Rate Digital Subscriber Line Systems

Orthogonal Frequency Division Multiplexing (OFDM) systems have found their way into many high bit rate applications, as they exhibit high spectral efficiency with a low-complexity implementation. Discrete Multi-tone Transmission (DMT) can be seen as a variation of OFDM that can be employed when a feedback channel provides Channel State Information (CSI), such as the received Signal to Noise Ratio (SNR) of every sub-band. This allows the transmitter to adjust the information carried by each carrier according to the capacity of the considered sub-band. DMT has been implemented with success in xDSL (Digital Subscriber Line) systems for the wired local loop. The aim of xDSL technologies is to provide wideband, variable-bandwidth data over existing copper lines at minimal expense. Nowadays, ADSL technology, with bit rates up to a theoretical 16 Mbit/s over distances up to 5 kilometers, allows operators to offer multimedia services, fast internet access and video telephony. The next step is Very high bit rate DSL (VDSL) technology, which should provide data rates up to 56 Mbit/s over distances up to 1.5 kilometers between the optical network and the end user. ADSL and VDSL technical principles are detailed in [11], [12], [13]. In the case of VDSL, there are 2783 carriers and the spacing between adjacent carriers is 4.3125 kHz. The transmitted power is limited to 14.5 dBm for the downstream and to 11.5 dBm for the upstream. The constellation (2QAM to 32768QAM) is chosen by a bitloading algorithm that takes into account the SNR of the considered carrier.

The number of bits (which can take values from 0 up to 15) on the i-th carrier can be obtained from the following expression:

$$b(i) = \log_2\left(1 + \frac{G \cdot \mathrm{SNR}(i)}{\Gamma}\right) \qquad (5)$$

where G is the coding gain, Γ is a constant which depends on the allowed BER (for instance, Γ = 9.6 dB for a target BER of $10^{-6}$) and SNR(i) is the Signal to Noise Ratio on the i-th carrier [11].

TABLE I
COMPARISON OF THE CODING GAINS (IN dB) FOR VARIOUS CONSTELLATION SIZES

QAM          RS+TCM    BCH(64,57)²    RS+BCH(64,57)²
BPSK         4.5       7.5            7.2
4QAM         6.8       7.5            7.2
8QAM         6.8       8.0            7.7
16QAM        6.2       7.5            7.3
32QAM        5.8       8.0            7.7
64QAM        6.0       8.5            8.2
128QAM       5.9       9.0            8.7
256QAM       5.8       9.2            9.0
512QAM       5.8       9.5            9.2
1024QAM      5.8       9.8            9.5
2048QAM      5.7       10.0           9.7
4096QAM      5.6       10.0           9.7
8192QAM      5.6       10.4           10.0
16384QAM     5.7       10.8           10.4
32768QAM     5.9       11.0           10.6

The information-bearing signals are filtered by the line transfer function H(f, d), which depends on the frequency f and the distance d through the R, L, C, G line constants; we display results for a 0.5 mm diameter TP150 line [13]. The received SNRs are degraded by different sources of noise and interference: Additive White Gaussian Noise (AWGN), alien noise created by other systems in the lower part of the frequency plans, and crosstalk between two lines, namely NEXT (Near-End Crosstalk) and FEXT (Far-End Crosstalk). The power spectral densities of the NEXT and the FEXT are the transmitted power density filtered by [13]:

$$|H_{\mathrm{NEXT}}(f, d)| = K_N \left(\frac{f}{f_0}\right)^{0.75} \sqrt{1 - |H(f, d)|^4} \qquad (6)$$

$$|H_{\mathrm{FEXT}}(f, d)| = K_F\, \frac{f}{f_0} \sqrt{\frac{d}{d_0}}\; |H(f, d)| \qquad (7)$$
where $K_N = 10^{-2.5}$, $K_F = 10^{-2.25}$, $d_0$ = 1 km and $f_0$ = 1 MHz. The overall noise power then depends on each noise contribution [13]:

$$N = \left(\sum_{i \in \{\mathrm{NEXT},\, \mathrm{FEXT},\, \mathrm{Alien},\, \mathrm{AWGN}\}} (N_i)^{1/0.6}\right)^{0.6} \qquad (8)$$
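As an illustration of how equations (5) and (8) combine, the following Python sketch adds the noise contributions and then computes the bit loading of one carrier. It is a simplified sketch under stated assumptions, not the bitloading algorithm used for the results of this paper: the function names are hypothetical, the coding gain is passed as a single fixed value (whereas Table I gives a gain per constellation), the result is rounded down to an integer number of bits, and the 8 dB system margin and the 0 to 15 bit range are the values quoted in the text.

```python
import math


def db_to_linear(value_db):
    """Convert a power ratio from dB to linear scale."""
    return 10.0 ** (value_db / 10.0)


def combine_noise(noise_powers):
    """Overall noise power of equation (8) from the individual
    contributions (NEXT, FEXT, alien noise, AWGN), in linear units."""
    return sum(p ** (1.0 / 0.6) for p in noise_powers) ** 0.6


def bits_on_carrier(snr_db, coding_gain_db, gap_db=9.6, margin_db=8.0,
                    max_bits=15):
    """Number of bits of equation (5) for one carrier.

    In dB, G * SNR / Gamma becomes coding gain + SNR - gap; the margin is
    subtracted from the SNR first.  Rounding down to an integer and
    clipping to [0, max_bits] are assumptions of this sketch.
    """
    effective_db = snr_db - margin_db + coding_gain_db - gap_db
    bits = math.log2(1.0 + db_to_linear(effective_db))
    return max(0, min(max_bits, math.floor(bits)))


# Hypothetical carrier with a 40 dB received SNR, loaded with two
# different coding gains (5.8 dB and 9.5 dB appear in Table I).
bits_low_gain = bits_on_carrier(40.0, 5.8)
bits_high_gain = bits_on_carrier(40.0, 9.5)
print(bits_low_gain, bits_high_gain)
```

With these hypothetical inputs, the larger coding gain buys one additional bit on the carrier, which is the mechanism by which coding gain translates into bit rate.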
The resulting SNR(i), which designates the received Signal to Noise Ratio on the i-th carrier in equation (5), is finally reduced by an 8 dB system margin.

B. Channel Coding for VDSL Systems

As higher bit rates on the same channel lead to larger bandwidth occupancy with dramatically higher attenuation, very robust error correcting codes are required. A classical channel coding scheme dedicated to the wired local loop is the concatenation of Wei's 4D Trellis Coded Modulation (TCM) with a (255,239) Reed-Solomon (RS) outer code (RS+TCM) [11]. Since the time when ADSL was initially conceived, soft decoding has experienced a breakthrough with the use of iterative turbo decoding. We thus compare results for a VDSL2 system using various coding schemes, namely the standard RS+TCM concatenation, a BCH(64,57)² product code turbo decoded with Pyndiah's algorithm [2], and the concatenation of the outer RS(255,239) code with a BCH(64,57)² product code [2]. In order to perform the bitloading of the carriers, we display in Table I the coding gains of the RS+TCM, BCH(64,57)² and RS+BCH(64,57)² schemes for the various QAMs used by VDSL systems. The larger coding gains of the iterative decoding schemes logically allow higher bit rates to be achieved for any local loop length and for any number of interfering lines. Figure 2 and Figure 3 illustrate this improvement on a TP150 model line [13] in the case of
49 NEXT + FEXT interfering lines, respectively for the upload and for the download streams. This improvement can be appreciated in two different ways. For a given local loop with a given line length, BTC schemes provide a bit rate improvement generally ranging between 10 and 20%. Alternatively, for the different bit rate classes of service, there is a range gain of the order of 100 meters.

[Figure 2: upload bit rates (Mbit/s) versus line length (km) for BTC BCH(64,57)², RS(255,239) + BTC BCH(64,57)² and RS(255,239) + TCM]
Fig. 2. Upload bit rates for different coding schemes and 49 interfering lines (Asymmetric frequency plan 998)

[Figure 3: download bit rates (Mbit/s) versus line length (km) for BTC BCH(64,57)², RS(255,239) + BTC BCH(64,57)² and RS(255,239) + TCM]
Fig. 3. Download bit rates for different coding schemes and 49 interfering lines (Asymmetric frequency plan 998)

C. Impulse Noise

Short-duration, high-magnitude impulse noise is another source of degradation for xDSL systems [14]. A traditional way to address this problem is to use a convolutional interleaver. Table II gives the delay that the interleaving must introduce in order to reach a $10^{-7}$ BER target in an impulse noise environment; the impulse noise power spectral density is equal to 100 dBm/Hz and the variable impulse noise duration can affect up to three DMT symbols (500 µs). The shorter RS+BCH(64,57)² interleaving depths are achieved because, in this case, we display results with soft decoding: when the receiver detects unexpected DMT energy, the bits are erased and the RS outer decoder performs erasure decoding.

TABLE II
COMPARISON OF THE DELAYS INTRODUCED BY THE INTERLEAVING

Impulse noise duration (µs)     10      20      50      100     200     250     500
RS+TCM delay (ms)               6.75    7.5     8.25    8.25    10.75   11.75   23.5
RS+BCH(64,57)² delay (ms)       0.75    1.25    3.25    4       4       4       8

IV. CONCLUSION

This paper presented a particularly low-complexity procedure to implement block turbo decoding, leading to the same results as Pyndiah's original work [2]. This low complexity allows the implementation of larger codes for which the Chase-Pyndiah algorithm gives near-capacity results in only four iterations. This algebraic-based decoding scheme seems particularly suited to high data rate systems. This has been illustrated in the case of Very high bit rate Digital Subscriber Line systems.

REFERENCES

[1] J. Hagenauer, E. Offer, and L. Papke, “Iterative decoding of binary block and convolutional codes,” IEEE Trans. Commun., vol. 44, pp. 429–445, Mar. 1996.
[2] R. M. Pyndiah, “Near optimum decoding of product codes: block turbo codes,” IEEE Trans. Commun., vol. 46, no. 8, pp. 1003–1010, Aug. 1998.
[3] D. Chase, “A class of algorithms for decoding block codes with channel measurement information,” IEEE Trans. Info. Theory, vol. IT-18, pp. 170–182, Feb. 1972.
[4] P. Adde and R. Pyndiah, “Recent simplifications and improvements in block turbo codes,” in 2nd Int. Symposium on Turbo codes and related topics, Brest, France, Sept. 2000, pp. 133–136.
[5] Z. Chi, L. Song, and K. K. Parhi, “On the performance/complexity tradeoff in block turbo decoder design,” IEEE Trans. Commun., vol. 52, no. 2, pp. 173–175, Feb. 2004.
[6] C. Argon and S. W. McLaughlin, “A parallel decoder for low latency decoding of turbo product codes,” IEEE Commun. Letters, vol. 6, no. 2, pp. 70–72, Feb. 2002.
[7] S. A. Hirst, B. Honary, and G. Markarian, “Fast Chase algorithm with an application in turbo decoding,” IEEE Trans. Commun., vol. 49, pp. 1693–1699, Oct. 2001.
[8] B. Geller, “Contribution à l'étude des systèmes de communications numériques,” Habilitation à Diriger des Recherches, Université Paris 12, France, Dec. 2004.
[9] C. Vanstraceele, “Turbo codes et estimation paramétrique pour les communications à haut débit,” PhD dissertation, ENS de Cachan, France, 2005.
[10] S. Kerouedan, P. Adde, and R. Pyndiah, “How we implemented block turbo codes?” in Annals of telecommunications, vol. 56, no. 7-8, 2001, pp. 447–454.
[11] T. Starr, J. M. Cioffi, and P. J. Silverman, Digital Subscriber Line Technology. Upper Saddle River, NJ: Prentice Hall, 1999.
[12] J. A. C. Bingham, ADSL, VDSL and Multicarrier Modulation. Wiley-Interscience, 2000.
[13] Very-high-bit-rate Digital Subscriber Lines (VDSL) Metallic Interface, ANSI Std. T1.E1.4/2003-210R1, Aug. 2003.
[14] D. Toumpakaris, J. M. Cioffi, and D. Gardan, “Reduced-delay protection of DSL systems against nonstationary disturbances,” IEEE Trans. Commun., vol. 52, no. 11, pp. 1927–1938, Nov. 2004.