ISIT 2006, Seattle, USA, July 9-14, 2006

On the Performance of Multivariate Interpolation Decoding of Reed-Solomon Codes

Farzad Parvaresh, Mohammad H. Taghavi, Alexander Vardy
University of California San Diego, La Jolla, CA 92093, U.S.A.
Abstract: The multivariate interpolation decoding (MID) algorithm for certain Reed-Solomon codes was recently introduced by Parvaresh and Vardy. The MID algorithm attempts to list-decode up to $n\tau_{\mathrm{MID}} = n\left(1 - R^{M/(M+1)}\right)$ errors in a Reed-Solomon code of length $n$ and rate $R$, using $(M+1)$-variate polynomial interpolation. This improves on the Guruswami-Sudan decoding radius of $\tau_{\mathrm{GS}} = 1 - \sqrt{R}$ by a large margin, especially for high-rate codes. The problem is that successful decoding is not guaranteed: there are certain patterns of fewer than $n\tau_{\mathrm{MID}}$ errors which the MID algorithm fails to decode. Nevertheless, simulations show that the actual performance of the MID decoder is very close to what one would expect if all patterns of up to $n\tau_{\mathrm{MID}}$ errors were corrected. On the other hand, analysis of the failure probability for the MID algorithm is extremely difficult, and there were no analytic results so far to confirm this empirically observed behavior. In this work, we provide such analytic results: we present a detailed analysis of the probability of failure in the MID algorithm for the special case where $M = 2$ and the interpolation multiplicity is $m = 1$. In this case, the MID algorithm attempts to correct up to $n\tau_{2,1}$ errors, where $\tau_{2,1} = 1 - \sqrt[3]{6R^2}$. We consider the situation where symbol values received from the channel at the erroneous positions are distributed uniformly at random (a version of the $q$-ary symmetric channel). We show that, with high probability, the performance of the MID algorithm is very close to the optimum in this case. Specifically, we prove that if the fraction of positions in error is at most $\tau_{2,1} - O(R^{5/3})$, then the probability of failure in the MID algorithm is at most $n^{-\Omega(n)}$. Thus the probability of failure is, indeed, negligible for large $n$ in this case.

1424405041/06/$20.00 ©2006 IEEE

I. INTRODUCTION

Reed-Solomon codes are ubiquitous, with applications ranging from magnetic recording to deep-space communications. The classical decoding algorithms (employed in all these applications today) correct up to $\frac{1}{2}n(1-R)$ errors in a Reed-Solomon code of length $n$ and rate $R$. In two breakthrough papers, Sudan [9] and Guruswami-Sudan [6] showed that we can do much better: Reed-Solomon codes could be used to correct up to $n(1-\sqrt{R})$ errors, if the decoder is allowed to output a small list of codewords. As it turns out, in practice, the probability that the Guruswami-Sudan decoder actually produces more than one codeword is usually negligible (cf. [8]).

More recently, several papers [1–3,5,10–12] showed that the Guruswami-Sudan decoding radius $\tau_{\mathrm{GS}} = 1 - \sqrt{R}$ could be substantially improved upon, under certain conditions. All these papers consider the following scenario: $M$ codewords of an $(n, k, d)$ Reed-Solomon code $C$ over $\mathbb{F}_q$ are decoded together and the errors are assumed to be synchronized, that is, the error positions are the same for all the $M$ codewords. Alternatively, one can think of the entire $M \times n$ array, having the $M$ codewords of $C$ as its rows, as a single codeword of an $(n, k, d)$ code $\mathbb{C}$ over the field of order $Q = q^M$. Then, if a codeword of $\mathbb{C}$ is corrupted by some $t$ errors, these errors are necessarily synchronized with respect to the constituent codewords of $C$. Curiously, the code $\mathbb{C}$ is actually a Reed-Solomon code over $\mathbb{F}_Q$ (as shown in [12]).

[Figure 1. Error-correction radii of BKY, CS, and MID algorithms. Horizontal axis: rate of the code (0 to 1). Vertical axis: fraction of errors corrected (0 to 1). Curves: Berlekamp-Massey, $\frac{1}{2}(1-R)$; BKY, $\frac{2}{3}(1-R)$; CS, $1 - R - R^{2/3}$; MID, $1 - R^{2/3}$.]

Polynomial-time decoding algorithms for this framework have been recently developed by Bleichenbacher, Kiayias, and Yung [1], by Coppersmith and Sudan [3], and by Parvaresh and Vardy [10,12]. The three algorithms are based upon very different ideas, but have several common features. First, all three attempt to decode significantly beyond the Guruswami-Sudan decoding radius $\tau_{\mathrm{GS}} = 1 - \sqrt{R}$. The error-correction radii of the three algorithms are given by

$$\tau_{\mathrm{BKY}} = \frac{M}{M+1}\,(1-R), \qquad \tau_{\mathrm{CS}} = 1 - R - R^{M/(M+1)}, \qquad \tau_{\mathrm{MID}} = 1 - R^{M/(M+1)}$$
respectively. These are plotted in Figure 1 for the special case $M = 2$. Second, in all three cases, successful decoding up to this radius is not guaranteed: there are certain specific error patterns of fewer than $n\tau_{\mathrm{BKY}}$ (or $n\tau_{\mathrm{CS}}$, or $n\tau_{\mathrm{MID}}$, respectively) errors which cause a decoding failure. The probability of such decoding failure for the BKY and CS decoders is analyzed in great detail in [1, 2] and in [3], respectively. It would be nice to have a similar probabilistic analysis for the multivariate interpolation decoder (MID) of Parvaresh and Vardy [10]. Unfortunately, this task appears to be extremely difficult; the present work can be regarded as the first step toward this goal. As suggested by the referees, it might be worth indicating here some of the advantages of the multivariate interpolation approach of [10,12] over the alternatives. First, as can be seen
from Figure 1, the error-correction radius $\tau_{\mathrm{MID}}$ of the MID algorithm is greater than $\tau_{\mathrm{BKY}}$ or $\tau_{\mathrm{CS}}$ for all rates. Second, while the BKY and CS decoders simply fail for certain error patterns, the MID algorithm offers a graceful degradation option: decrease the decoding radius slightly, and try again; repeat as needed, until the "failure" is resolved. Third, we have observed in simulations that while the probability of decoding failure is significant for the CS decoder, it is often negligible for the MID algorithm. Finally, it is now known [5, 11] that the multivariate interpolation approach leads to deterministic (adversarial errors) decoding beyond the Guruswami-Sudan radius, and eventually to deterministic decoding up to the ultimate radius of $1 - R$. However, the variations of the MID algorithm in [5, 11] that achieve such deterministic decoding exact a price in the code rate, the decoding complexity, the list size, and so forth. Is paying this price justified in practice? The only way to answer this question is to understand the probability of failure in the original MID algorithm.

In this paper, we study a relatively simple special case where $M = 2$ and the interpolation multiplicity is $m = 1$ (as opposed to $m \to \infty$, which is needed to achieve the error-correction radii $\tau_{\mathrm{CS}}$ and $\tau_{\mathrm{MID}}$). In this case, the MID algorithm attempts to correct at most $n\tau_{2,1}$ errors, where $\tau_{2,1} = 1 - \sqrt[3]{6R^2}$. We consider the situation where symbol values received from the channel at the erroneous positions are distributed uniformly at random and show that, with high probability, the performance of the MID algorithm is very close to the optimum in this case. Our main result is the following theorem.

Theorem 1. Let $\mathbb{C}$ be an $(n, k, d)$ Reed-Solomon code over the field of order $Q = q^2$ obtained by evaluating polynomials of degree $< k$ over $\mathbb{F}_Q$ in a set of points $\{x_1, x_2, \dots, x_n\} \subseteq \mathbb{F}_q$. Further assume that

$$k - 1 = \frac{6(n+1)}{s(s+1)(s+2)} - \frac{3}{s} \qquad (1)$$

for an integer $s \ge 1$. Suppose that a codeword $c$ of $\mathbb{C}$ is transmitted over a $Q$-ary symmetric channel and the number of channel errors satisfies

$$t \le n\left(1 - \sqrt[3]{6R^2} - O\!\left(R^{5/3}\right)\right) \qquad (2)$$

Then the multivariate interpolation decoder recovers $c$ from the channel output with probability at least $1 - n^{-\Omega(n)}$.

Theorem 1 shows that the probability of failure in the MID algorithm is, indeed, negligible for large $n$ in this special case. Due to space limitations, we will provide only a sketch of the proof of Theorem 1 in what follows.

II. BACKGROUND AND NOTATION

Let $\mathbb{F}_q$ denote the finite field with $q$ elements, let $Q = q^2$, and let $\{1, \beta\}$ be a fixed basis for $\mathbb{F}_Q$ over $\mathbb{F}_q$. We will consider two Reed-Solomon codes $C_q(n, k)$ and $C_Q(n, k)$, both obtained by evaluating polynomials of degree at most $k - 1$ in the same set of points $\mathcal{D} = \{x_1, x_2, \dots, x_n\} \subseteq \mathbb{F}_q$. Throughout this paper, we assume that $n, k$ are both $\Theta(q)$, namely $k = Rn$ and $n = \eta q$,
where $R$ and $\eta$ are constants. Specifically, the codes $C_q(n, k)$ and $C_Q(n, k)$ are defined as follows:

$$C_q(n, k) \stackrel{\mathrm{def}}{=} \big\{ \big(f(x_1), \dots, f(x_n)\big) : f \in \mathbb{F}_q[X],\ \deg f < k \big\}$$

$$C_Q(n, k) \stackrel{\mathrm{def}}{=} \big\{ \big(c_1 + \beta c'_1, \dots, c_n + \beta c'_n\big) : c, c' \in C_q(n, k) \big\}$$

We shall, moreover, assume that $k$ satisfies (1). We point out that this limits the validity of our results to only certain rates.

Suppose that a codeword $c$ of $C_Q(n, k)$ is transmitted over a noisy channel and the vector $v = (v_1, v_2, \dots, v_n) \in \mathbb{F}_Q^n$ is received at the channel output. We write $v_j = y_j + \beta z_j$ with $y_j$ and $z_j$ in $\mathbb{F}_q$ for all $j = 1, 2, \dots, n$, and define

$$\mathcal{P}_v \stackrel{\mathrm{def}}{=} \big\{ (x_1, y_1, z_1), (x_2, y_2, z_2), \dots, (x_n, y_n, z_n) \big\} \qquad (3)$$

We let $I(\mathcal{P}_v) \subset \mathbb{F}_q[X, Y, Z]$ denote the ideal of polynomials over $\mathbb{F}_q$ that pass through all the points in $\mathcal{P}_v$, namely

$$I(\mathcal{P}_v) \stackrel{\mathrm{def}}{=} \big\{ P : P(x_j, y_j, z_j) = 0 \ \ \forall\, (x_j, y_j, z_j) \in \mathcal{P}_v \big\} \qquad (4)$$

We define the $(1, k-1, k-1)$-weighted degree, or simply the weighted degree, of a monomial $X^a Y^b Z^c$ as follows:

$$\operatorname{Wdeg} X^a Y^b Z^c \stackrel{\mathrm{def}}{=} a + (k-1)\,b + (k-1)\,c \qquad (5)$$
We extend the weighted degree in (5) to a monomial order $\prec_W$ by augmenting it with the lex order, and define the weighted degree of a polynomial as the weighted degree of its leading monomial. We denote the minimal Gröbner basis, with respect to the $\prec_W$ order, for the ideal $I(\mathcal{P}_v)$ by

$$\mathcal{G}(\mathcal{P}_v) \stackrel{\mathrm{def}}{=} \big\{ G_1(X, Y, Z), G_2(X, Y, Z), \dots, G_\ell(X, Y, Z) \big\}$$

where $G_1 \prec_W G_2 \prec_W \cdots \prec_W G_\ell$. The following theorem is one of the main results of Parvaresh and Vardy in [10, 12].

Theorem 2. Let $t = d(c, v)$ be the number of positions in $v$ that are in error. Then multivariate interpolation decoding successfully recovers the transmitted codeword $c$ provided

$$\mathrm{GCD}\big\{ G \in \mathcal{G}(\mathcal{P}_v) : \operatorname{Wdeg} G < n - t \big\} \in \mathbb{F}_q[X] \qquad (6)$$

Herein, we will not be concerned with the details of the MID algorithm; rather, we will use the sufficient condition (6) as our guiding principle. It is clear from (6) that our task involves careful estimation of the weighted degrees of the polynomials in $\mathcal{G}(\mathcal{P}_v)$. To this end, the following function will be useful:

$$N(\Delta) \stackrel{\mathrm{def}}{=} \big|\big\{ X^a Y^b Z^c : \operatorname{Wdeg} X^a Y^b Z^c \le \Delta \big\}\big| \qquad (7)$$

Observe that

$$\frac{\Delta^3}{6(k-1)^2} < N(\Delta) \le \frac{(\Delta + k)^3}{6(k-1)^2} \qquad (8)$$

as shown in [6,10,11]. However, in this paper, we will need to be more precise. Specifically, we will use the fact that

$$N\big(s(k-1)\big) = \tfrac{1}{6}\,(s+1)(s+2)\big(s(k-1) + 3\big) \qquad (9)$$

for all positive integers $s$ (this fact is the source for the condition (1) on $k$ and $n$ in Theorem 1). The correctness of (9) can be established by straightforward enumeration.
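The enumeration behind (9), and the way condition (1) makes $N(s(k-1))$ equal to $n + 1$, can be checked numerically. The following is a minimal sketch, not part of the original paper; the function names `weighted_monomial_count`, `closed_form`, and `k_from_condition` are hypothetical, and the code assumes $k \ge 2$ so that the weight $k - 1$ is positive.

```python
from fractions import Fraction


def weighted_monomial_count(delta, k):
    """Count monomials X^a Y^b Z^c with a + (k-1)(b+c) <= delta, as in (7)."""
    w = k - 1  # weight of Y and Z; assumes k >= 2
    count = 0
    for b in range(delta // w + 1):
        for c in range((delta - w * b) // w + 1):
            # admissible exponents of X are a = 0, 1, ..., delta - w*(b+c)
            count += delta - w * (b + c) + 1
    return count


def closed_form(s, k):
    """Right-hand side of (9)."""
    return (s + 1) * (s + 2) * (s * (k - 1) + 3) // 6


def k_from_condition(n, s):
    """Solve condition (1) for k; return None when k is not an integer >= 2."""
    k_minus_1 = Fraction(6 * (n + 1), s * (s + 1) * (s + 2)) - Fraction(3, s)
    if k_minus_1.denominator != 1 or k_minus_1 < 1:
        return None
    return int(k_minus_1) + 1


if __name__ == "__main__":
    n, s = 17, 2
    k = k_from_condition(n, s)
    print("k =", k)
    print("N(s(k-1)) =", weighted_monomial_count(s * (k - 1), k), "and n + 1 =", n + 1)
```

For example, with $n = 17$ and $s = 2$, condition (1) gives $k - 1 = 3$, and both the enumeration and the closed form give $N(6) = 18 = n + 1$.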
III. A BOUND ON THE FAILURE PROBABILITY

Before diving into the technical details, we first describe the general strategy of our proof. The key idea is to show that, in most cases, the minimal element of $\mathcal{G}(\mathcal{P}_v)$, namely the polynomial $G_1(X, Y, Z)$, will be irreducible. More precisely, we will show that $G_1(X, Y, Z) = p(X)\,G^*(X, Y, Z)$, for an irreducible polynomial $G^*(X, Y, Z)$. Loosely speaking, to prove that $G_1(X, Y, Z)$ is of this form, we will establish an upper bound on $\operatorname{Wdeg} G_1$ (Lemma 3) as well as a lower bound on $\operatorname{Wdeg} G_1$ which holds with high probability (Lemma 4). Next we show in Lemma 5 that if $G_1$ is not of the desired form, then the lower bound exceeds the upper bound, a contradiction. Now, if $G_1(X, Y, Z) = p(X)\,G^*(X, Y, Z)$, where $G^*(X, Y, Z)$ is irreducible, then Theorem 2 implies that the MID algorithm will correct $t$ errors as long as $n - t$ is greater than $\operatorname{Wdeg} G$, where $G$ is any element of $\mathcal{G}(\mathcal{P}_v)$ that does not have $G^*$ as a factor. Finally, using the fact that the size of the delta-set of the ideal $I(\mathcal{P}_v)$ is $n$, we derive an upper bound on the weighted degree of $G$, and Theorem 1 follows.

Lemma 3. $\operatorname{Wdeg} G_1 \le s(k-1)$.
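The dimension count on which Lemma 3 rests can be illustrated with a small computation: when the number of monomials of weighted degree at most $s(k-1)$ is $n + 1$, a homogeneous linear system with $n$ equations always has a nonzero solution, which is a nonzero polynomial vanishing at all $n$ points. The sketch below is only an illustration under stated assumptions and is not the MID decoder: it works over a prime field $\mathbb{F}_p$ (the paper allows any $\mathbb{F}_q$), uses plain Gaussian elimination, and all function names are hypothetical. It relies on `pow(x, -1, p)`, available in Python 3.8 and later.

```python
import random


def monomials(delta, k):
    """All exponent triples (a, b, c) with a + (k-1)(b+c) <= delta."""
    w = k - 1
    return [(a, b, c)
            for b in range(delta // w + 1)
            for c in range((delta - w * b) // w + 1)
            for a in range(delta - w * (b + c) + 1)]


def evaluate(mons, coeffs, point, p):
    """Evaluate the polynomial sum(coeffs[i] * mons[i]) at a point, modulo p."""
    x, y, z = point
    return sum(co * pow(x, a, p) * pow(y, b, p) * pow(z, c, p)
               for co, (a, b, c) in zip(coeffs, mons)) % p


def nullspace_vector(rows, ncols, p):
    """Return a nonzero v with rows * v = 0 over F_p, or None if only v = 0 works."""
    rows = [r[:] for r in rows]
    pivots = {}  # pivot column -> index of its row
    r = 0
    for col in range(ncols):
        if r == len(rows):
            break
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] % p), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        inv = pow(rows[r][col], -1, p)
        rows[r] = [(x * inv) % p for x in rows[r]]
        for i in range(len(rows)):
            if i != r and rows[i][col]:
                f = rows[i][col]
                rows[i] = [(x - f * y) % p for x, y in zip(rows[i], rows[r])]
        pivots[col] = r
        r += 1
    free = next((c for c in range(ncols) if c not in pivots), None)
    if free is None:
        return None
    v = [0] * ncols
    v[free] = 1  # set one free variable to 1, solve for the pivot variables
    for col, row in pivots.items():
        v[col] = (-rows[row][free]) % p
    return v


def interpolate(points, k, delta, p):
    """Find a nonzero polynomial of weighted degree <= delta through all points."""
    mons = monomials(delta, k)
    rows = [[pow(x, a, p) * pow(y, b, p) * pow(z, c, p) % p for (a, b, c) in mons]
            for (x, y, z) in points]
    return mons, nullspace_vector(rows, len(mons), p)


if __name__ == "__main__":
    p, n, k, s = 19, 17, 4, 2  # these values satisfy condition (1)
    pts = [(x, random.randrange(p), random.randrange(p)) for x in range(n)]
    mons, coeffs = interpolate(pts, k, s * (k - 1), p)
    print(len(mons), "monomials,", n, "points")
    print("vanishes everywhere:", all(evaluate(mons, coeffs, pt, p) == 0 for pt in pts))
```

The polynomial returned here has weighted degree at most $s(k-1)$ but is not necessarily the Gröbner basis element $G_1$; it only witnesses that such a polynomial exists in the ideal.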
Proof. Let $P \in \mathbb{F}_q[X, Y, Z]$. Then $P \in I(\mathcal{P}_v)$ if and only if the coefficients of $P$ satisfy $n$ linear equations, one for each point in $\mathcal{P}_v$. Thus we can think of the coefficients of $P$ as unknowns in a homogeneous system of $n$ linear equations. If there are at least $n + 1$ unknown coefficients, then the system is guaranteed to have a nonzero solution. It follows that $I(\mathcal{P}_v)$ contains a nonzero polynomial of weighted degree at most $\Delta$ provided $N(\Delta) \ge n + 1$. But $N\big(s(k-1)\big) = n + 1$ by (1) and (9), and so $I(\mathcal{P}_v)$ contains a nonzero polynomial of weighted degree at most $s(k-1)$. Since $G_1$ is the minimal element of $\mathcal{G}(\mathcal{P}_v)$, it has the smallest weighted degree among all nonzero polynomials in $I(\mathcal{P}_v)$, and the lemma follows.

Let us fix a set $\mathcal{E} \stackrel{\mathrm{def}}{=} \{x_1, x_2, \dots, x_t\} \subset \mathcal{D}$, where $\mathcal{D} \subseteq \mathbb{F}_q$ is the evaluation set for both $C_q(n, k)$ and $C_Q(n, k)$. Now let $y_1, y_2, \dots, y_t, z_1, z_2, \dots, z_t$ be i.i.d. random variables distributed uniformly over $\mathbb{F}_q$, and define

$$\mathcal{P} \stackrel{\mathrm{def}}{=} \big\{ (x_1, y_1, z_1), (x_2, y_2, z_2), \dots, (x_t, y_t, z_t) \big\} \qquad (10)$$

as in (3). As in (4), let $I(\mathcal{P}) \subset \mathbb{F}_q[X, Y, Z]$ be the ideal of polynomials that pass through all the points in $\mathcal{P}$. Pick any nonzero polynomial $Q(X, Y, Z)$ in $I(\mathcal{P})$. We will study $\operatorname{Wdeg} Q$, which is a random variable. First, observe that

$$\operatorname{Wdeg} Q \ge \min\{t,\ k-1\} \qquad (11)$$

Indeed, if $Q$ has positive degree in either $Y$ or $Z$, then we have $\operatorname{Wdeg} Q \ge k - 1$ by (5). Otherwise, $Q \in \mathbb{F}_q[X]$ and it must be divisible by $\prod_{x_i \in \mathcal{E}} (X - x_i)$, in which case $\operatorname{Wdeg} Q \ge t$.

Lemma 4. For all $\varepsilon > 0$, with probability at least $1 - n^{-\Omega(n)}$ over the choice of $y_1, y_2, \dots, y_t$ and $z_1, z_2, \dots, z_t$, we have

$$\operatorname{Wdeg} Q \ge \min\Big\{ t - k,\ \sqrt[3]{6t(k-1)^2} - k \Big\} - \varepsilon n \qquad (12)$$

Proof. Let $A(t, \varepsilon)$ be the event in (12), and let $\bar{A}(t, \varepsilon)$ be its complement. We need to show that $\Pr\{\bar{A}(t, \varepsilon)\} \le n^{-\Omega(n)}$.
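[Figure 2. Counting the number of curves. For each of the points $x_{j_1}, x_{j_2}, \dots, x_{j_{|\mathcal{X}|}}$, the $q^2$ points $(\alpha, \beta)$ include the zeros of $P(x, Y, Z)$, of which there are at most $q\Delta/(k-1)$.]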
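We say that a point $x \in \mathcal{E}$ is an $X$-zero of $Q$ if $(X - x)$ is a factor of $Q$. For each subset $\mathcal{Z} \subseteq \mathcal{E}$, we let $E_{\mathcal{Z}}$ denote the event that the set of all $X$-zeros of $Q$ is exactly $\mathcal{Z}$. Then

$$\Pr\{\bar{A}(t, \varepsilon)\} = \sum_{\mathcal{Z} \subseteq \mathcal{E}} \Pr\nolimits_{\mathcal{Z}}\{\bar{A}(t, \varepsilon)\}\, \Pr\{E_{\mathcal{Z}}\} \qquad (13)$$

$$\phantom{\Pr\{\bar{A}(t, \varepsilon)\}} = \sum_{|\mathcal{Z}| < t - k} \Pr\nolimits_{\mathcal{Z}}\{\bar{A}(t, \varepsilon)\}\, \Pr\{E_{\mathcal{Z}}\} \qquad (14)$$

where $\Pr_{\mathcal{Z}}$ is the conditional probability measure $\Pr\{\,\cdot \mid E_{\mathcal{Z}}\}$. Let us explain the second equality above. Given that $E_{\mathcal{Z}}$ occurred, we can factor $Q$ as follows:

$$Q(X, Y, Z) = P(X, Y, Z) \prod_{x \in \mathcal{Z}} (X - x) \qquad (15)$$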
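This makes it clear that $\operatorname{Wdeg} Q = |\mathcal{Z}| + \operatorname{Wdeg} P$. Hence, if $|\mathcal{Z}| \ge t - k$, then $\operatorname{Wdeg} Q$ is greater than the right-hand side of (12), and $\Pr_{\mathcal{Z}}\{\bar{A}(t, \varepsilon)\} = 0$. Now let $\mathcal{X} = \mathcal{E} \setminus \mathcal{Z}$, and let $B_\Delta$ be the event that $\operatorname{Wdeg} P \le \Delta$ in the factorization of (15). We derive a bound on $\Pr\{B_\Delta \cap E_{\mathcal{Z}}\}$ in what follows. First, observe that $X - x$ is not a factor of $P$ for any $x \in \mathcal{X}$ (in view of (15) and the definition of $\mathcal{Z}$), which implies that $P(x, Y, Z) \not\equiv 0$ (it is not the all-zero polynomial). Also note that

$$P(x, y, z) = 0 \quad \forall\, (x, y, z) \in \mathcal{P} \text{ such that } x \in \mathcal{X} \qquad (16)$$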
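We associate with each such polynomial $P$ the set of curves $\mathcal{C}_{\mathcal{X}, P}$ defined as follows. Let $\mathcal{X} = \{x_{j_1}, x_{j_2}, \dots, x_{j_{|\mathcal{X}|}}\}$; we say that $\big( (x_{j_1}, \alpha_1, \beta_1), (x_{j_2}, \alpha_2, \beta_2), \dots, (x_{j_{|\mathcal{X}|}}, \alpha_{|\mathcal{X}|}, \beta_{|\mathcal{X}|}) \big) \in \mathcal{C}_{\mathcal{X}, P}$ if and only if $P(x_{j_i}, \alpha_i, \beta_i) = 0$ for all $i = 1, 2, \dots, |\mathcal{X}|$. Note that $\deg_{\mathrm{tot}} P(x, Y, Z) \le \operatorname{Wdeg} P/(k-1)$ for all $x \in \mathcal{X}$. Therefore, if $\operatorname{Wdeg} P \le \Delta$, then for each $x \in \mathcal{X}$, there are at most $q\Delta/(k-1)$ pairs $(\alpha, \beta)$ in $\mathbb{F}_q^2$ such that $P(x, \alpha, \beta) = 0$, by the Schwartz lemma [4]. With reference to Figure 2, it follows that

$$\big|\mathcal{C}_{\mathcal{X}, P}\big| \le \left(\frac{q\Delta}{k-1}\right)^{|\mathcal{X}|} \qquad (17)$$

Now let $\mathcal{P}(\mathcal{X}, \Delta)$ denote the set of all polynomials of weighted degree at most $\Delta$ in $\mathbb{F}_q[X, Y, Z]$ which satisfy (16) along with $P(x, Y, Z) \not\equiv 0$ for all $x \in \mathcal{X}$. Further, let us define the corresponding set of curves $\mathcal{C}(\mathcal{X}, \Delta) = \bigcup_{P \in \mathcal{P}(\mathcal{X}, \Delta)} \mathcal{C}_{\mathcal{X}, P}$. Then

$$|\mathcal{C}(\mathcal{X}, \Delta)| \le \left(\frac{q\Delta}{k-1}\right)^{t - |\mathcal{Z}|} q^{N(\Delta)} \qquad (18)$$

which follows by combining (17) with the fact that the total number of polynomials in $\mathcal{P}(\mathcal{X}, \Delta)$ is at most $q^{N(\Delta)}$. If both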
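$B_\Delta$ and $E_{\mathcal{Z}}$ occur, then $P \in \mathcal{P}(\mathcal{X}, \Delta)$ and the restriction of the random set $\mathcal{P}$ in (10) to $\mathcal{X}$ (namely, the set of points of $\mathcal{P}$ with $X$-coordinate in $\mathcal{X}$) must belong to $\mathcal{C}(\mathcal{X}, \Delta)$. Hence

$$\Pr\{B_\Delta \cap E_{\mathcal{Z}}\} \le \frac{q^{2|\mathcal{Z}|}\, |\mathcal{C}(\mathcal{X}, \Delta)|}{q^{2t}} \qquad (19)$$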
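The zero-counting step behind (17) uses the fact that a nonzero bivariate polynomial of total degree at most $d$ over a field with $q$ elements has at most $dq$ zeros in the plane. As a sanity check only, the following sketch verifies this bound by brute force for small random polynomials. It assumes a prime field $\mathbb{F}_p$ for simplicity (the paper works over a general $\mathbb{F}_q$), and the function names `count_zeros` and `random_nonzero_poly` are hypothetical.

```python
import itertools
import random


def count_zeros(coeffs, p):
    """Count pairs (y, z) in F_p^2 at which the polynomial vanishes.

    coeffs maps exponent pairs (b, c) to the coefficient of Y^b Z^c.
    """
    zeros = 0
    for y, z in itertools.product(range(p), repeat=2):
        value = sum(co * pow(y, b, p) * pow(z, c, p)
                    for (b, c), co in coeffs.items())
        if value % p == 0:
            zeros += 1
    return zeros


def random_nonzero_poly(d, p, rng):
    """Random polynomial of total degree at most d with a nonzero coefficient."""
    exps = [(b, c) for b in range(d + 1) for c in range(d + 1 - b)]
    while True:
        coeffs = {e: rng.randrange(p) for e in exps}
        if any(coeffs.values()):
            return coeffs


if __name__ == "__main__":
    rng = random.Random(1)
    p = 7
    for d in (1, 2, 3):
        worst = max(count_zeros(random_nonzero_poly(d, p, rng), p) for _ in range(50))
        print(f"d = {d}: most zeros seen = {worst}, bound d*p = {d * p}")
```

In the setting of (17), $d$ plays the role of $\Delta/(k-1)$ and $p$ the role of $q$, so each $x \in \mathcal{X}$ contributes at most $q\Delta/(k-1)$ admissible pairs $(\alpha, \beta)$.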
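We next consider a carefully chosen value of $\Delta$. Specifically, motivated by the right-hand side of (8), we set

$$\Delta = \sqrt[3]{6(k-1)^2\,\big(t - |\mathcal{Z}|\big)} - k - \delta n \qquad (20)$$

where $\delta \le \varepsilon$ is a positive constant to be fixed later. For such $\Delta$, the right-hand side of (8) implies that

$$N(\Delta) - t + |\mathcal{Z}| \le -\frac{\delta n}{6} \left( \frac{\delta^2}{R^2} - \frac{3\delta}{R}\,\gamma_{t,|\mathcal{Z}|} + 3\gamma_{t,|\mathcal{Z}|}^2 \right) \qquad (21)$$

where $\gamma_{t,|\mathcal{Z}|} \stackrel{\mathrm{def}}{=} \sqrt[3]{6\big(t - |\mathcal{Z}|\big)/(k-1)}$. In view of (14), we are concerned only with the case where $|\mathcal{Z}| < t - k$. For such $\mathcal{Z}$, we have $\gamma_{t,|\mathcal{Z}|} > \sqrt[3]{6}$, which in conjunction with (21) implies

$$N(\Delta) - t + |\mathcal{Z}| \le -\frac{\delta n}{6} \left( \frac{\delta^2}{R^2} - \frac{3\delta}{R}\sqrt[3]{6} + 3\sqrt[3]{36} \right) \qquad (22)$$

provided $\delta \le R\sqrt[3]{6}$. Finally, we observe that with the value of $\Delta$ given by (20), we have

$$\left(\frac{\Delta}{k-1}\right)^{t - |\mathcal{Z}|} \le \left(\frac{6t}{k-1}\right)^{n/3} \le \left(\frac{6}{R}\right)^{n/3} \qquad (23)$$

Substituting (18) into (19) gives $\Pr\{B_\Delta \cap E_{\mathcal{Z}}\} \le \big(\Delta/(k-1)\big)^{t - |\mathcal{Z}|}\, q^{N(\Delta) - t + |\mathcal{Z}|}$. Combining this with (22) and (23), we arrive at the desired bound on $\Pr\{B_\Delta \cap E_{\mathcal{Z}}\}$: for the $\Delta$ in (20) with $\delta \le R\sqrt[3]{6}$ and for all $\mathcal{Z}$ with $|\mathcal{Z}| < t - k$, we have

$$\Pr\{B_\Delta \cap E_{\mathcal{Z}}\} \le \left(\frac{6}{R}\right)^{n/3} q^{-\frac{\delta n}{6} \left( \frac{\delta^2}{R^2} - \frac{3\delta}{R}\sqrt[3]{6} + 3\sqrt[3]{36} \right)} \qquad (24)$$

To complete the proof, we reason as follows. Suppose that the event $\bar{A}(t, \varepsilon) \cap E_{\mathcal{Z}}$ has occurred, where $|\mathcal{Z}| < t - k$. Then

$$\operatorname{Wdeg} P \le \sqrt[3]{6t(k-1)^2} - k - |\mathcal{Z}| - \varepsilon n \qquad (25)$$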
k − 1, ∆U k − 1 √ √ 3 n 1 − R t n 1 − 6R2
t1 + t2 = t,
It turns out that the minimum in (30) always exceeds the upper bound $\operatorname{Wdeg} G_1 \le s(k-1)$ of Lemma 3. The difference between (30) and $s(k-1)$ is plotted in Figure 3 for all rates that satisfy (1). Since this difference is positive, we have arrived at a contradiction and (28) cannot hold. From this point on, we give only a brief sketch for the rest of the proof. From Theorem 2, we know that if some other Gröbner basis polynomial $G$ in $\mathcal{G}(\mathcal{P}_v)$ does not have $G^*$ as a factor, the MID algorithm will successfully recover the trans-
Now, let $\mathcal{F}(\mathcal{P}_v)$ denote the set of all elements of $\mathcal{G}(\mathcal{P}_v)$ that do not have $G^*$ as a factor. Let $G$ denote the minimal element of $\mathcal{F}(\mathcal{P}_v)$. Then we show that

$$\operatorname{Wdeg} G \le n\left(\sqrt[3]{6R^2} + O\!\left(R^{5/3}\right)\right) \qquad (32)$$

with probability at least $1 - n^{-\Omega(n)}$ provided the number $t$ of channel errors satisfies (27). The proof of (32) is essentially based on the observation that if $\operatorname{Wdeg} G$ is too large, then the size of the delta-set of $I(\mathcal{P}_v)$ becomes larger than $n$, in contradiction with (31). This is so because $G_1$ can only "carve out" a small number of monomials from the delta-set. Specifically, let $\Delta_1 = \operatorname{Wdeg} G_1$ and $\Delta = \operatorname{Wdeg} G$. Then Lemma 4 implies that with high probability $\Delta_1 \ge \sqrt[3]{6(k-1)^2\,t} - k$. On the other hand, using the argument outlined above, we obtain

$$\frac{\Delta^3}{6(k-1)^2} - \frac{\big(\Delta - (\Delta_1 - k) + k\big)^3}{6(k-1)^2} \le n$$

Combining these two inequalities yields the bound (32) on $\Delta$, which finally establishes Theorem 1.

REFERENCES

[1] D. Bleichenbacher, A. Kiayias, and M. Yung, "Decoding of interleaved Reed-Solomon codes over noisy data," Lect. Notes Computer Sci., 2719, pp. 97–108, 2003.
[2] A. Brown, L. Minder, and M.A. Shokrollahi, "Improved decoding of interleaved AG codes," Lect. Notes Computer Sci., 3796, pp. 37–46, 2005.
[3] D. Coppersmith and M. Sudan, "Reconstructing curves in three (and higher) dimensional spaces from noisy data," Proc. 35-th ACM Symp. Theory of Computing (STOC), pp. 136–142, San Diego, CA., June 2003.
[4] D. Cox, J. Little, and D. O'Shea, Ideals, Varieties, and Algorithms, Berlin: Springer-Verlag, 1996.
[5] V. Guruswami and A. Rudra, "Explicit capacity-achieving list-decodable codes," Proc. 38-th Annual ACM Symp. Theory of Computing (STOC), pp. 1–10, Seattle, WA., 2006.
[6] V. Guruswami and M. Sudan, "Improved decoding of Reed-Solomon and algebraic-geometric codes," IEEE Trans. Inform. Theory, 45, pp. 1755–1764, September 1999.
[7] J. Ma, P. Trifonov, and A. Vardy, "Divide-and-conquer interpolation for list decoding of Reed-Solomon codes," Proc. IEEE Symp. Information Theory (ISIT), Chicago, IL., July 2004.
[8] R.J. McEliece, "The Guruswami-Sudan decoding algorithm for Reed-Solomon codes," Interplanetary Network Progress Report, 42-135, Jet Propulsion Lab., NASA, May 2003.
[9] M. Sudan, "Decoding of Reed-Solomon codes beyond the error correction bound," J. Complexity, 12, pp. 180–193, March 1997.
[10] F. Parvaresh and A. Vardy, "Multivariate interpolation decoding beyond the Guruswami-Sudan radius," Proc. 42-nd Annual Allerton Conf. on Communications, Control and Computing, Urbana, IL., October 2004.
[11] F. Parvaresh and A. Vardy, "Correcting errors beyond the Guruswami-Sudan radius in polynomial time," Proc. 46-th Annual IEEE Symp. Foundations of Computer Science (FOCS), Pittsburgh, PA., October 2005.
[12] F. Parvaresh and A. Vardy, "Multivariate interpolation decoding of Reed-Solomon codes," preprint, May 2006.