Efficient Packet Erasure Decoding by Transforming the Systematic Generator Matrix of an RS Code

D.J.J. Versfeld
School for Electrical and Information Engineering, University of the Witwatersrand, Johannesburg, South Africa
Email: [email protected]

H.C. Ferreira
Department of Electrical and Electronic Engineering Science, University of Johannesburg, Johannesburg, South Africa
Email: [email protected]

A.S.J. Helberg
School for Electrical and Electronic Engineering, North-West University, Potchefstroom, South Africa
Email: [email protected]
Abstract: We consider the erasures-only decoding of Reed-Solomon codes. One class of erasure decoders of algebraic codes can be viewed as a transformation of the systematic generator matrix, as first noticed by Berlekamp. We extend Berlekamp's method to Reed-Solomon codes and compare the performance of the decoder with various other Reed-Solomon erasure decoders. The developed decoder reduces the decoding time of packet erasures significantly when used in conjunction with a modified interleaver found in the literature.
978-1-4244-2271-5/08/$25.00 ©2008 IEEE

I. INTRODUCTION

The reliable transfer of information is an important challenge with many applications. Shannon observed that for a given channel, if data is transmitted at an information rate below capacity, the transmission can be made reliable. In practice, one method of accomplishing this reliable transfer is to deploy forward error correction.

There has been a renewed interest in erasures-only decoding in the last decade. Many researchers have suggested that erasures-only decoding can be applied to packet-switched networks, where packet loss occurs due to congestion and degrades the Quality of Service of the communication system. Future optical networks are also prone to packet loss, albeit due to other causes such as contention. Other systems where erasures-only decoding can be applied include hard disk arrays (RAID systems).

In this paper, we extend the erasures-only decoding of linear binary codes proposed by Berlekamp [1] to erasures-only decoding of Reed-Solomon codes. Berlekamp's proposed decoder achieves erasures-only decoding by transforming the systematic generator matrix. The method we propose uses a consequence of Reed and Solomon's paper [2] and is optimized to find the set of individual codewords constituting the transformed generator matrix. The technique leads to significant reductions in decoding time when optimized for use in conjunction with a modified interleaving scheme proposed by various researchers, including Kamali et al. [3] and Rizzo [4].

The remainder of this paper is structured as follows. Section II provides a brief overview of existing erasures-only decoders. Section III presents the theory on which the decoder is based, and Section IV develops the erasures-only decoding algorithm based on transforming the systematic generator matrix of a Reed-Solomon code. Section V presents results on the performance of the decoder compared to various decoders found in the literature. This is followed by our conclusions in Section VI.

II. RELATED WORK

In general, erasures-only decoders fall into two broad categories: syndrome-based decoding and interpolation-based decoding.

With syndrome-based decoding, a syndrome must be calculated for each codeword. The syndrome and the parity-check matrix are then used to construct the erasure polynomial. Syndrome-based decoders are very efficient for high-rate codes, because the minimum distance of these codes is small, so only a small number of unknown coefficients need to be determined for the erasure polynomial. The decoders of McAuley [5], Xu et al. [6], Kamali et al. [3] and Blömer et al. [7] are syndrome-based decoders.

In their landmark paper [2], Reed and Solomon used interpolation techniques to correct polynomials with corrupted coefficients. The crux of erasures-only interpolation-based techniques is that a polynomial of degree k − 1 can be reconstructed from any k of n values, where the n values were obtained by evaluating the polynomial at certain predefined points. With interpolation-based techniques, no syndrome calculation is done. Interpolation-based decoders are very efficient for low-rate codes, because only a k × k submatrix of the generator matrix needs to be inverted. The decoders of [1], [4] and [8] are based on interpolation.

In the next section, we develop an interpolation-based erasures-only decoder based on transforming the systematic generator matrix.

III. ERASURES-ONLY DECODING BASED ON BERLEKAMP'S METHOD

Berlekamp [1] used a technique that he termed 'basis vector transformations' to accomplish erasures-only decoding of binary linear codes.
First of all, it is well known that any linear code over a finite field with $q$ elements, $\mathbb{F}_q$, can be viewed as a linear subspace $C \subset \mathbb{F}_q^n$ [9]. Furthermore, a subset $v_1, v_2, \ldots, v_k$ of vectors in a vector (sub)space $V$ that are linearly independent and span $V$ constitutes a basis of $V$. As a consequence, a set of vectors $(v_1, v_2, \ldots, v_k)$ in $V$ forms a basis of $V$ if and only if every vector $v \in V$ can be uniquely written as

$$ v = a_1 v_1 + a_2 v_2 + \cdots + a_k v_k, \tag{1} $$
where $a_1, \ldots, a_k$ are elements of the base field. In addition, $V$ has many different bases and each basis has the same number of basis vectors [10]. Each basis forms a generator matrix of the code $C$.

A feature of a special set of generator matrices is that their encoding results in codewords where each codeword can be partitioned into an information set and a redundancy set [11, p. 4]. In order for each codeword to contain an information set and a redundancy set, the generator matrix $G$ must contain a set of $k$ independent columns. When the first $k$ coordinates form an information set, the code has a unique generator matrix of the form $[I_k \mid A]$, where $I_k$ is the $k \times k$ identity matrix; this generator matrix is called the systematic generator matrix [11]. The rest of the generator matrices in this set we will call quasi-systematic.

The crux of Berlekamp's decoder [1] is to change the information set and calculate the corresponding quasi-systematic matrix according to the erasure pattern. The redundancy set will then correspond to the erased symbols. Thus, encoding with the chosen information set and the corresponding quasi-systematic matrix will recover the erased symbols.

Berlekamp focused on binary codes. For our purposes, we will extend Berlekamp's method to codes over $GF(2^m)$. However, we will narrow our scope down to Maximum Distance Separable (MDS) codes only. For MDS codes, we obtain the following definition for a quasi-systematic matrix.

Definition 1 (Quasi-systematic Matrix of an MDS Code): Consider an $(n, k)$ MDS linear code $C$. A generator matrix $B$ is said to be a quasi-systematic matrix of $C$ if $B$ contains a partitioned $k \times k$ identity matrix, i.e. $B$ contains $k$ unit column vectors, and these unit column vectors appear in ascending order but are not necessarily adjacent.

The following two lemmas will assist us in developing the erasures-only decoder for Reed-Solomon codes (a subclass of MDS codes).
Lemma 1 (Number of zero symbols): If $c_{min} \in C$, where $C$ is an MDS code and the Hamming weight of $c_{min}$ equals the minimum distance $d_{min}$ of $C$, then $c_{min}$ will have $k - 1$ zero symbols.

Proof: For $C$ to be MDS, $C$ must have a minimum Hamming distance $d_{min}$ of $n - k + 1$. The number of zero symbols of any minimum weight codeword $c_{min}$ of an MDS code is equal to $n - d_{min} = n - (n - k + 1)$. Thus, the number of zero symbols of $c_{min}$ is $k - 1$.

To state Lemma 1 in other words, the maximum number of zero symbols that any non-zero codeword of an MDS code can have is $k - 1$.

Lemma 2 (Hamming weights of the basis vectors): Each basis vector of a quasi-systematic matrix of an MDS code must be of minimum Hamming weight.

Proof: By Definition 1, each (row) vector in the quasi-systematic matrix has $k - 1$ zero symbols, due to the partitioned $k \times k$ identity matrix. As each vector in the quasi-systematic matrix is also a non-zero codeword of the MDS code, it cannot contain any further zeros (by Lemma 1). All the other symbols are therefore non-zero and the vector has minimum weight equal to $d_{min}$.

To accomplish erasures-only decoding, find the quasi-systematic matrix $B$ of the code $C$ such that the positions of the unit vector columns of $B$ correspond to the positions of a set of $k$ of the received symbols. As an example of a quasi-systematic matrix of an MDS code, consider a $(7, 4)$ linear MDS code. Assume that the first, fourth and sixth symbols were erased. The corresponding quasi-systematic matrix will have the form

$$ B = \begin{bmatrix} b_{1,1} & 1 & 0 & b_{1,2} & 0 & b_{1,3} & 0 \\ b_{2,1} & 0 & 1 & b_{2,2} & 0 & b_{2,3} & 0 \\ b_{3,1} & 0 & 0 & b_{3,2} & 1 & b_{3,3} & 0 \\ b_{4,1} & 0 & 0 & b_{4,2} & 0 & b_{4,3} & 1 \end{bmatrix}, \tag{2} $$

where the elements $b_{i,j}$ are non-zero elements of the field $GF(2^m)$. Multiplying the vector containing the $k$ received data symbols by $B$ results in the original codeword. To recover a single erased symbol, only the inner product of the corresponding parity column of $B$ with the vector containing the $k$ received symbols is necessary. (No syndrome computation is required.)

However, finding the correct set of basis vectors for the non-binary case, i.e. for codes over $GF(2^m)$, remains a substantial and time-consuming task. One way of computing the matrix $B$ is to use Gauss reduction. This is in effect what Rizzo's method [4] accomplishes. Gauss reduction has complexity $O(k^3)$ to find the matrix $B$ when no optimizations are considered.

We will now develop the erasures-only decoder for Reed-Solomon codes that finds the $k$ codewords (basis vectors) constituting the quasi-systematic matrix by using a consequence of Reed and Solomon's paper [2]. Reed and Solomon's original approach is based on interpolation.
The idea is to take $k$ data symbols $\{m_0, m_1, \ldots, m_{k-1}\}$ from a finite field $GF(q)$ and construct the polynomial $P(x) = m_0 + m_1 x + \cdots + m_{k-1} x^{k-1}$ [12]. The Reed-Solomon codeword $c$ corresponding to $P(x)$ is calculated as $(P(0), P(\alpha), P(\alpha^2), \ldots, P(1))$ [2], where $\alpha$ is a primitive element of the field. We will use a slightly different approach to construct the Reed-Solomon codes. We will represent our codewords as the $n$-tuples

$$ c = (P(\alpha^0), P(\alpha^1), \ldots, P(\alpha^{n-1})). \tag{3} $$
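The codeword construction in (3), together with the Gauss-reduction route to the quasi-systematic matrix described earlier in this section, can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the field GF(2^3) generated by the primitive polynomial x^3 + x + 1, the (7, 4) code parameters, 0-based symbol indices, and the hypothetical function names `encode`, `generator_matrix`, `quasi_systematic` and `decode`.

```python
# GF(2^3) built from the primitive polynomial x^3 + x + 1 (an assumed choice).
EXP = [1, 2, 4, 3, 6, 7, 5]              # EXP[i] = alpha^i
LOG = {v: i for i, v in enumerate(EXP)}  # discrete logarithm table
N, K = 7, 4                              # illustrative (7, 4) code

def gf_mul(a, b):
    """Multiply two field elements via log/antilog tables."""
    if a == 0 or b == 0:
        return 0
    return EXP[(LOG[a] + LOG[b]) % N]

def gf_inv(a):
    """Multiplicative inverse of a nonzero field element."""
    return EXP[(-LOG[a]) % N]

def encode(msg):
    """Codeword (P(alpha^0), ..., P(alpha^(n-1))) as in (3), via Horner's rule."""
    word = []
    for j in range(N):
        acc = 0
        for coeff in reversed(msg):      # highest-degree coefficient first
            acc = gf_mul(acc, EXP[j]) ^ coeff
        word.append(acc)
    return word

def generator_matrix():
    """Row i is the codeword of P(x) = x^i, so G[i][j] = alpha^(i*j)."""
    return [[EXP[(i * j) % N] for j in range(N)] for i in range(K)]

def quasi_systematic(received):
    """Gauss-Jordan reduction so that the columns listed in `received`
    (k ascending 0-based indices) become unit vectors."""
    B = generator_matrix()
    for t, col in enumerate(received):
        pivot = next(r for r in range(t, K) if B[r][col] != 0)
        B[t], B[pivot] = B[pivot], B[t]
        scale = gf_inv(B[t][col])
        B[t] = [gf_mul(scale, x) for x in B[t]]
        for r in range(K):
            if r != t and B[r][col] != 0:
                factor = B[r][col]
                # Subtraction equals XOR in characteristic 2.
                B[r] = [x ^ gf_mul(factor, y) for x, y in zip(B[r], B[t])]
    return B

def decode(received, symbols):
    """Re-encode the k received symbols with B to rebuild the full codeword."""
    B = quasi_systematic(received)
    word = [0] * N
    for sym, row in zip(symbols, B):
        word = [w ^ gf_mul(sym, x) for w, x in zip(word, row)]
    return word

codeword = encode([3, 0, 5, 1])
received = [1, 2, 4, 6]                  # first, fourth and sixth symbols erased
recovered = decode(received, [codeword[j] for j in received])
print(recovered == codeword)
```

The erasure pattern matches the example in (2). This sketch is only the unoptimized baseline; it does not use the minimum-weight construction developed next.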
From Lemma 2, we know that the $k$ codewords of the quasi-systematic matrix will have minimum weight. We will use the following lemma [13] to construct codewords of minimum weight for a Reed-Solomon code.

Lemma 3 (Minimum Weight Codewords of RS Codes): In order to construct a codeword $c_{min}$ of minimum weight such that the symbols with indices $\{i_0, i_1, \ldots, i_{k-2}\}$ are equal to zero, construct the polynomial $u(x) = s \cdot (x - \alpha^{i_0})(x - \alpha^{i_1}) \cdots (x - \alpha^{i_{k-2}})$, where $\alpha^{i_j}$ and $s$ are non-zero elements of $GF(2^m)$. The codeword $c_{min}$ can be found by evaluating $u(x)$ at $\{\alpha^0, \alpha^1, \ldots, \alpha^{n-1}\}$.

Proof: We know that a non-zero codeword with the symbols at indices $\{i_0, i_1, \ldots, i_{k-2}\}$ equal to zero will have minimum weight, as $k - 1$ symbols are zero, following Lemma 1. The polynomial $u(x)$ has degree $k - 1$, because it consists of $k - 1$ factors of degree 1. Thus, by using (3) and substituting $u(x)$ for $P(x)$, the resulting codeword $c$ will be a Reed-Solomon codeword. Substituting $\alpha^{i_j}$ into $u(x)$ results in zero, as $\alpha^{i_j}$ is a root of $u(x)$; this is because $(x - \alpha^{i_j}) \mid u(x)$. Using (3), the symbol at index $i_j$ is calculated by substituting $\alpha^{i_j}$ into $P(x)$, which, as we saw, results in a zero symbol. As $u(x)$ has $k - 1$ different roots, $c$ will have $k - 1$ zero symbols, located at indices $\{i_0, i_1, \ldots, i_{k-2}\}$. Since a non-zero polynomial of degree $k - 1$ has at most $k - 1$ roots, the remaining $n - k + 1$ symbols are non-zero.

From Lemma 2 we know that each vector in the quasi-systematic matrix will have minimum weight. Lemma 3 provides a way of computing the individual basis vectors constituting the quasi-systematic matrix. This enables us to develop the erasures-only decoder.

IV. AN OPTIMIZED ERASURES-ONLY DECODING ALGORITHM
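The construction in Lemma 3 can be checked numerically with a short sketch. The field GF(2^3) with primitive polynomial x^3 + x + 1, the (7, 4) parameters, the 0-based indices, the choice s = 1 and the function name `min_weight_codeword` are all assumptions made for illustration.

```python
# GF(2^3) built from the primitive polynomial x^3 + x + 1 (an assumed choice).
EXP = [1, 2, 4, 3, 6, 7, 5]              # EXP[i] = alpha^i
LOG = {v: i for i, v in enumerate(EXP)}
N, K = 7, 4                              # illustrative (7, 4) code

def gf_mul(a, b):
    """Multiply two field elements via log/antilog tables."""
    if a == 0 or b == 0:
        return 0
    return EXP[(LOG[a] + LOG[b]) % N]

def min_weight_codeword(zero_indices, s=1):
    """Evaluate u(x) = s * prod_j (x - alpha^{i_j}) at alpha^0, ..., alpha^(n-1).
    In characteristic 2, subtraction is XOR."""
    word = []
    for j in range(N):
        value = s
        for i in zero_indices:
            value = gf_mul(value, EXP[j] ^ EXP[i])
        word.append(value)
    return word

# k - 1 = 3 prescribed zero positions (0-based indices 1, 2 and 4).
c_min = min_weight_codeword([1, 2, 4])
weight = sum(1 for x in c_min if x != 0)
print(c_min, weight)
```

With three prescribed zeros the weight comes out as n − k + 1 = 4, which is the minimum distance of the (7, 4) code, as Lemma 1 predicts.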
Assume that a Reed-Solomon codeword $c$ is sent over an erasure channel and the vector $r$ is received. Furthermore, assume that the coefficients at indices $b_1, b_2, \ldots, b_k$ are error free and the rest of the coefficients are flagged as erasures. Define a polynomial $F(X)$ as

$$ F(X) = (X - \alpha^{b_1})(X - \alpha^{b_2}) \cdots (X - \alpha^{b_k}). \tag{4} $$

We need to construct a codeword for the $i$-th row of the generator matrix such that the following properties hold:
• the codeword must have minimum weight,
• the coefficients at indices $b_1, b_2, \ldots, b_{i-1}, b_{i+1}, \ldots, b_k$ must be zero and
• the coefficient at index $b_i$ must be non-zero.

A codeword with these properties can be determined by evaluating the polynomial $f_i(X)$ given by

$$ f_i(X) = F(X)/(X - \alpha^{b_i}). \tag{5} $$

In order to get the coefficient at index $b_i$ equal to one, we divide $f_i(X)$ by $f_i(\alpha^{b_i})$ to form the polynomial $f_i'(X)$ (the prime denotes the normalized polynomial, not a derivative):

$$ f_i'(X) = f_i(X)/f_i(\alpha^{b_i}). \tag{6} $$

It is clear that the polynomial $f_i'(X)$ is zero at $\alpha^{b_1}, \alpha^{b_2}, \ldots, \alpha^{b_{i-1}}, \alpha^{b_{i+1}}, \ldots, \alpha^{b_k}$ and that it is equal to one at $\alpha^{b_i}$. The rest of the coefficients of the $i$-th row of the generator matrix are determined by evaluating $f_i'(X)$ at the erased positions. In order to reduce the number of computations, we first calculate and store the values $c_j$ determined by evaluating $F(X)$:

$$ c_j = F(\alpha^j), \quad j \in \{1, 2, \ldots, n\} \setminus \{b_1, b_2, \ldots, b_k\}. \tag{7} $$

In order to compute the values of $f_i(X)$ at $\alpha^{b_i}$ we compute

$$ H(X) = D_X[F(X)], \tag{8} $$

where $D_X[\cdot]$ is the Hasse derivative with respect to $X$. We then have that $f_i(\alpha^{b_i}) = H(\alpha^{b_i})$: by the product rule, $H(X) = \sum_{l=1}^{k} f_l(X)$, and every $f_l$ with $l \neq i$ vanishes at $\alpha^{b_i}$. We also calculate and store the values $v_i$, where $v_i$ is determined by evaluating $H(X)$ as follows:

$$ v_i = H(\alpha^{b_i}), \quad i = 1, 2, \ldots, k. \tag{9} $$

Following the above reasoning, we can express the non-identity part of the generator matrix $G = [g_{i,j}]$ as follows:

$$ g_{i,j} = \frac{c_j}{(\alpha^j - \alpha^{b_i})\, v_i}. \tag{10} $$
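The computation in (4) to (10) can be sketched as follows. This is an illustrative reading of the algorithm under stated assumptions, not the authors' implementation: it assumes GF(2^3) with primitive polynomial x^3 + x + 1, a (7, 4) code, and 0-based indices as in (3). The values $v_i$ are computed as the direct product $\prod_{l \neq i} (\alpha^{b_i} - \alpha^{b_l})$, which equals $H(\alpha^{b_i})$, rather than by forming the Hasse derivative explicitly. The names `decoding_columns` and `recover` are hypothetical.

```python
# GF(2^3) built from the primitive polynomial x^3 + x + 1 (an assumed choice).
EXP = [1, 2, 4, 3, 6, 7, 5]              # EXP[i] = alpha^i
LOG = {v: i for i, v in enumerate(EXP)}
N, K = 7, 4                              # illustrative (7, 4) code

def gf_mul(a, b):
    """Multiply two field elements via log/antilog tables."""
    if a == 0 or b == 0:
        return 0
    return EXP[(LOG[a] + LOG[b]) % N]

def gf_div(a, b):
    """Divide a by a nonzero field element b."""
    if a == 0:
        return 0
    return EXP[(LOG[a] - LOG[b]) % N]

def encode(msg):
    """Codeword (P(alpha^0), ..., P(alpha^(n-1))) as in (3)."""
    word = []
    for j in range(N):
        acc = 0
        for coeff in reversed(msg):
            acc = gf_mul(acc, EXP[j]) ^ coeff
        word.append(acc)
    return word

def decoding_columns(received):
    """Return {erased index j: [g_{i,j} for i = 1..k]} following (7), (9), (10)."""
    erased = [j for j in range(N) if j not in received]
    # (7): c_j = F(alpha^j) at every erased position.
    c = {}
    for j in erased:
        value = 1
        for b in received:
            value = gf_mul(value, EXP[j] ^ EXP[b])
        c[j] = value
    # (9): v_i = H(alpha^{b_i}), computed as the product over l != i.
    v = []
    for b_i in received:
        value = 1
        for b_l in received:
            if b_l != b_i:
                value = gf_mul(value, EXP[b_i] ^ EXP[b_l])
        v.append(value)
    # (10): g_{i,j} = c_j / ((alpha^j - alpha^{b_i}) * v_i).
    return {
        j: [gf_div(c[j], gf_mul(EXP[j] ^ EXP[b_i], v_i))
            for b_i, v_i in zip(received, v)]
        for j in erased
    }

def recover(received, symbols):
    """Recover each erased symbol with one vector product."""
    recovered = {}
    for j, column in decoding_columns(received).items():
        acc = 0
        for sym, coeff in zip(symbols, column):
            acc ^= gf_mul(sym, coeff)
        recovered[j] = acc
    return recovered

codeword = encode([3, 0, 5, 1])
received = [1, 2, 4, 6]
erased_values = recover(received, [codeword[j] for j in received])
print(all(codeword[j] == val for j, val in erased_values.items()))
```

Storing the $c_j$ and $v_i$ once means each entry of the non-identity part costs one subtraction, one multiplication and one division.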
The above algorithm takes $O(n^2)$ operations to complete and the space requirements are $O(n)$. In the next section, we present some results on the performance of the decoders.

V. RESULTS

In this section, we compare the performance of various erasure decoders with the decoder developed in the previous sections. We use the erasure decoders in conjunction with the modified interleaver (see [4] and [3]). Our simulations were done on a 1.6 GHz Pentium 4, using Java. We created look-up tables for the multiplications, additions and inversions of the finite field elements. (Division was accomplished by finding an inverse and then doing the multiplication.)

We investigated two interpolation-based decoders, denoted as "RIZZO S" and "RIZZO N", discussed in [4] (the first uses a systematic code while the latter uses a non-systematic code). Three syndrome-based decoders were investigated. The decoder by Blömer et al. [7] is denoted as "BLOEMER", the decoder developed by Kamali et al. [3] is denoted as "KAMALI +" and the decoder developed by Xu et al. [6] is denoted as "XU +". The interpolation-based decoder developed in Sections III and IV is denoted as "BVTD".

The basis vectors of a quasi-systematic matrix can also be constructed by using known erasure decoders. For our purposes, we used the decoders of Kamali et al. [3] and Xu et al. [6] to compute the decoding matrix $B$, effectively forming interpolation-based decoders. We denote these decoders as "KAMALI *" and "XU *" respectively.

All of the above decoders were optimized for packet erasures by exploiting the fact that the erasure positions are duplicated in all the codewords of the interleaver.

In Fig. 1(a) and Fig. 1(b), we investigated the performance of the decoders using a (255, 10) Reed-Solomon code, with an interleaver depth of 1 and 1500, respectively. In Fig. 2(a) and Fig. 2(b), we investigated the performance of the decoders using a (255, 200) Reed-Solomon code, with an interleaver depth of 1 and 1500, respectively.
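The packet-erasure optimization mentioned in this section, where one erasure pattern is shared by every codeword of the interleaver, can be sketched as follows. The sketch assumes a small (7, 4) code over GF(2^3) with primitive polynomial x^3 + x + 1 and an interleaver depth of 5 instead of the (255, k) codes and depths used in the simulations; the layout in which packet j carries symbol j of every codeword, and the names `interleave`, `decoding_columns` and `recover_packets`, are illustrative assumptions.

```python
import random

# GF(2^3) built from the primitive polynomial x^3 + x + 1 (an assumed choice).
EXP = [1, 2, 4, 3, 6, 7, 5]              # EXP[i] = alpha^i
LOG = {v: i for i, v in enumerate(EXP)}
N, K = 7, 4                              # illustrative (7, 4) code

def gf_mul(a, b):
    return 0 if a == 0 or b == 0 else EXP[(LOG[a] + LOG[b]) % N]

def gf_div(a, b):
    return 0 if a == 0 else EXP[(LOG[a] - LOG[b]) % N]

def encode(msg):
    """Codeword (P(alpha^0), ..., P(alpha^(n-1))) as in (3)."""
    word = []
    for j in range(N):
        acc = 0
        for coeff in reversed(msg):
            acc = gf_mul(acc, EXP[j]) ^ coeff
        word.append(acc)
    return word

def decoding_columns(received):
    """One coefficient column per lost packet (entries g_{i,j} of (10))."""
    cols = {}
    for j in (x for x in range(N) if x not in received):
        col = []
        for b_i in received:
            num = den = 1
            for b_l in received:
                if b_l != b_i:
                    num = gf_mul(num, EXP[j] ^ EXP[b_l])
                    den = gf_mul(den, EXP[b_i] ^ EXP[b_l])
            col.append(gf_div(num, den))
        cols[j] = col
    return cols

def interleave(codewords):
    """Packet j carries symbol j of every codeword (assumed layout)."""
    return [[cw[j] for cw in codewords] for j in range(N)]

def recover_packets(packets, received):
    cols = decoding_columns(received)    # computed once per erasure pattern
    depth = len(packets[received[0]])
    lost = {}
    for j, col in cols.items():
        lost[j] = []
        for d in range(depth):           # one vector product per lost symbol
            acc = 0
            for b_i, coeff in zip(received, col):
                acc ^= gf_mul(packets[b_i][d], coeff)
            lost[j].append(acc)
    return lost

random.seed(1)
depth = 5
codewords = [encode([random.randrange(8) for _ in range(K)]) for _ in range(depth)]
packets = interleave(codewords)
received = [0, 2, 3, 5]
for j in (1, 4, 6):
    packets[j] = None                    # these packets are lost in transit
lost = recover_packets(packets, received)
original = interleave(codewords)
print(all(lost[j] == original[j] for j in lost))
```

The decoding columns depend only on the erasure pattern, so their cost is paid once and amortized over the interleaver depth; the per-codeword work is limited to vector products.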
Fig. 1. Number of packet erasures versus decoding time using a (255, 10) Reed-Solomon code: (a) interleaver depth of 1; (b) interleaver depth of 1500. [Plots not reproduced. Horizontal axis: number of packet erasures; vertical axis: decoding time, logarithmic scale; legend: BVTD, BLOEMER, RIZZO S, RIZZO N, XU +, XU *, KAMALI +, KAMALI *.]

Fig. 2. Number of packet erasures versus decoding time using a (255, 200) Reed-Solomon code: (a) interleaver depth of 1; (b) interleaver depth of 1500. [Plots not reproduced. Horizontal axis: number of packet erasures; vertical axis: decoding time, logarithmic scale; legend: BVTD, BLOEMER, RIZZO S, RIZZO N, XU +, XU *, KAMALI +, KAMALI *.]

From these figures, it is confirmed that for high-rate codes and small interleaving depths, the best performance is obtained by using syndrome-based decoders. On the other hand, if we use low-rate codes, the best performance is obtained with interpolation-based decoders. As we increase the interleaving depth of the modified interleaver, the performance of the syndrome-based decoders decreases relative to the performance of the interpolation-based decoders. The explanation for this is that, for the syndrome-based decoders, most of the time is spent on the syndrome computations of the codewords in the interleaver, while only vector products are necessary for the decoders based on basis vector transformations.

The two decoders giving the most consistent performance across all cases are the decoder of Blömer et al. and the BVTD decoder developed in Section IV. Although the decoder of Blömer et al. is classified as a syndrome decoder, it performs well for low-rate as well as high-rate codes. This is due to the manner in which the syndrome is calculated. (The syndrome computation depends only on the number of packet erasures, and not on the rate of the code, as it does for the decoders of Kamali et al. [3] and Xu et al. [6].) For larger numbers of packet erasures, and especially for larger interleaver depths, the BVTD decoder yields the best performance, outperforming the decoder of Blömer et al.

VI. CONCLUSION

The authors developed an erasures-only decoder for Reed-Solomon codes by extending a method proposed by Berlekamp and using a consequence of Reed and Solomon's original approach for constructing Reed-Solomon codes. The crux of Berlekamp's method is to transform the systematic generator matrix to a quasi-systematic generator matrix, according to the erasures experienced, and then to multiply a new information set with the quasi-systematic matrix in order to accomplish decoding. Berlekamp used Gauss reduction to transform the systematic generator matrix to the appropriate quasi-systematic matrix. By observing that the basis vectors constituting the quasi-systematic matrix are of minimum weight, we use a consequence of Reed and Solomon's original approach to construct the individual basis vectors. We also show an optimization for calculating these basis vectors. The developed decoder does not calculate syndromes for the
received vectors. When used in conjunction with a modified interleaver proposed by several researchers, the decoder achieves significant reductions in decoding time when the interleaving depth is large enough.

ACKNOWLEDGMENTS

We would like to thank Telkom SA Ltd for their financial support. This material is based on work also supported by the National Research Foundation under Grant number 2053408. The authors would like to thank Dr Ludo Tolhuizen, Prof. Dilip Sarwate and Prof. Ruud Pellikaan for their invaluable inputs and to acknowledge Prof. Dilip Sarwate for his help with Lemma 3. The comments from the anonymous reviewers of a previous draft submitted to the IEEE Transactions on Information Theory were very helpful and constructive. Special mention should be made of reviewer Y, who gave the optimized version of the algorithm as presented in Section IV.

REFERENCES

[1] E. Berlekamp, "Long block codes which use soft decisions and correct erasure bursts without interleaving," in Proc. of the National Telecommunications Conference, Los Angeles, USA, 1977, pp. 1–2.
[2] I. Reed and G. Solomon, "Polynomial codes over certain finite fields," SIAM Journal of Applied Mathematics, vol. 8, pp. 300–304, 1960.
[3] B. Kamali and P. Morris, "Application of erasure-only decoded Reed-Solomon codes in cell recovery for congested ATM networks," in Vehicular Technology Conference, 2000, pp. 982–986.
[4] L. Rizzo, "Effective erasure codes for reliable computer communication protocols," ACM Computer Communication Review, vol. 27, pp. 24–36, Apr. 1997.
[5] A. J. McAuley, "Reliable broadband communication using a burst erasure correcting code," in Proc. ACM SIGCOMM '90 (Special Issue, Computer Communication Review, vol. 20, no. 4), Sept. 1990, pp. 297–306.
[6] Y. Xu and T. Zhang, "Variable shortened-and-punctured Reed-Solomon codes for packet loss protection," IEEE Transactions on Broadcasting, vol. 48, pp. 237–245, 2002.
[7] J. Blömer, M. Kalfane, R. Karp, M. Karpinski, M. Luby, and D. Zuckerman, "An XOR-based erasure-resilient coding scheme," California, 1995. [Online]. Available: citeseer.ist.psu.edu/84162.html
[8] G.-L. Feng, R.-H. Deng, and F. Bao, "Packet-loss resilient coding scheme with only XOR operations," IEE Proceedings on Communications, vol. 151, pp. 322–328, Aug. 2004.
[9] E. W. Weisstein, "Linear code." http://mathworld.wolfram.com/LinearCode.html, 2005.
[10] E. W. Weisstein et al., "Vector space basis." http://mathworld.wolfram.com/VectorSpaceBasis.html, 2005.
[11] W. C. Huffman and V. Pless, Fundamentals of Error-Correcting Codes. Cambridge: Cambridge University Press, 2003.
[12] S. B. Wicker and V. K. Bhargava, Reed-Solomon codes and their applications. New York: Institute of Electrical and Electronic Engineers, Inc., 1994.
[13] D. V. Sarwate, private correspondence, 2004.