Soft-decision list decoding of Reed-Muller codes with linear complexity

Ilya Dumer, Grigory Kabatiansky, and Cédric Tavernier

Ilya Dumer is with the Department of Electrical Engineering, University of California, Riverside, CA 92521, USA (e-mail: [email protected]).
Grigory Kabatiansky is with the Institute for Information Transmission Problems, Moscow 101447, Russia, and with INRIA, Rocquencourt, France (e-mail: [email protected]). Research supported in part by RFFI grants 06-01-00226 and 06-07-89170 of the Russian Foundation for Fundamental Research.
Cédric Tavernier is with the National Knowledge Center (NKC-EAI), Abu Dhabi, UAE (e-mail: [email protected]).
Abstract—Let a binary Reed-Muller code RM(s, m) of length n = 2^m be used on a memoryless channel with an input alphabet ±1 and a real-valued output R. We consider a received vector y in R^n and define its generalized distance to any codeword c as the sum ∑ |y_j| taken over all positions j in which the vectors y and c have opposite signs. For soft-decision decoding in R^n, we consider the list L_T of codewords located within generalized distance T from the received vector y. We then apply the generalized Johnson bound to estimate the size L_T of this list. For any RM code RM(s, m) of fixed order s, the proposed algorithm performs list decoding beyond the error-correcting radius and retrieves the code list L_T with linear complexity of order n L_T.
I. INTRODUCTION

Preliminaries. Reed-Muller (RM) codes RM(s, m), defined by two integers m ≥ s ≥ 0, have length n, dimension k, and distance d as follows:

  n = 2^m,  k = ∑_{i=0}^{s} \binom{m}{i},  d = 2^{m−s}.
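As a quick sanity check of these parameters, the following Python sketch (ours, not part of the original text) evaluates n, k, and d:

    from math import comb

    def rm_params(s: int, m: int):
        # Length, dimension, and distance of RM(s, m) per the formulas above.
        n = 2 ** m
        k = sum(comb(m, i) for i in range(s + 1))
        d = 2 ** (m - s)
        return n, k, d

    print(rm_params(2, 5))  # RM(2, 5): n = 32, k = 16, d = 8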
RM codes have been extensively studied since the 1950s thanks to their simple code structure and fast decoding procedures. In particular, majority decoding, proposed in the seminal paper [5], has complexity order at most nk. An even lower complexity order of n min(s, m − s) suffices for the recursive techniques of [8], [9]. These algorithms also correct many errors beyond the error-correcting radius of d/2. In particular, for long RM codes RM(s, m) of any given order s, majority decoding (see [6]) and the recursive technique [9] correct most error patterns of weight T = n(1 − ε)/2 or less for any ε > 0. However, these algorithms depend on a specific error pattern of weight T. As a result, they cannot necessarily output the complete list of codewords located within a radius T ≥ d/2, with the exception of the recent results [2], [3], [4], which have complexity O(n ln^{s−1} n) or above. In this paper, we reduce the complexity order n ln^{s−1} n to the linear order O(n) and also extend our previous results to an arbitrary memoryless semi-continuous channel.

We assume that the codewords are taken from {±1}^n and a received vector z belongs to the Euclidean space R^n. Then
we consider the vector y = (y_0, ..., y_{n−1}) of log likelihoods

  y_j = ln ( Pr{+1 | z_j} / Pr{−1 | z_j} ).

Here we define an error cost ω_j = |y_j| in each position j. In the sequel, all sums over integers j are taken over the entire range [0, n − 1] if not stated otherwise. Without loss of generality, we scale the vector y to the same squared length ∑_j y_j^2 = n as that of any binary vector in {±1}^n. We also say that y has generalized weight W = ∑_j ω_j. Given y ∈ R^n and any vector c ∈ {±1}^n, consider the set J(y, c) = {j : y_j c_j < 0} of positions where the vectors y and c have opposite signs. Then we introduce the generalized Hamming distance

  D(y, c) = ∑_{j ∈ J(y,c)} ω_j.   (1)
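In code, the error costs, the generalized weight W, and the distance (1) read as follows (a minimal Python sketch with our own function names; y is a list of reals and c a list of ±1 entries):

    def generalized_weight(y):
        # W = sum of the error costs w_j = |y_j|.
        return sum(abs(yj) for yj in y)

    def generalized_distance(y, c):
        # D(y, c): sum of |y_j| over positions where y and c have opposite signs.
        return sum(abs(yj) for yj, cj in zip(y, c) if yj * cj < 0)

One can check numerically that generalized_distance(y, c) equals W/2 − ⟨y, c⟩/2, the identity stated next.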
Equivalently, D(y, c) = W/2 − ⟨y, c⟩/2, where ⟨a, b⟩ = ∑_j a_j b_j denotes the inner product of two real-valued vectors a, b. Clearly, the input c with the smallest distance D(y, c) is the vector with the maximum posterior probability. In the sequel, all operations with real numbers will be counted in bytes. Our main result, valid for general semi-continuous channels, is as follows.

Theorem 1: Consider a Reed-Muller code RM(s, m) of a fixed order s and a received vector y ∈ R^n of generalized weight W. Let

  T_max = W/2 − √((n − 2d)n)/2.   (2)

Then for any generalized distance T < T_max, the RM code C can be decoded into the code list

  L_{T;C}(y) = {c ∈ C : D(y, c) ≤ T}   (3)

with complexity

  2^{m+s} L_T min(s, m − s),   (4)

where

  L_T = 2dn / ((W − 2T)^2 − (n − 2d)n).   (5)

Note that L_T serves as the Johnson bound on the maximum size of the list L_{T;C}(y). Next, consider any binary symmetric channel. In this case, L_{T;C}(y) includes all vectors c ∈ C within the Hamming distance d(y, c) ≤ T from the vector y. Also, W = n. Then Theorem 1 gives a linear-complexity algorithm with a maximum correcting capability T_max that exceeds d/2 for any fixed s.
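For illustration, the radius (2) and the list-size bound (5) can be evaluated directly (a Python sketch under our naming; it assumes d ≤ n/2 and T < T_max, so that the denominator in (5) is positive):

    from math import sqrt

    def t_max(n, d, W):
        # Radius (2): Tmax = W/2 - sqrt((n - 2d) n) / 2.
        return W / 2 - sqrt((n - 2 * d) * n) / 2

    def list_bound(n, d, W, T):
        # Johnson-type bound (5): L_T = 2 d n / ((W - 2T)^2 - (n - 2d) n).
        return 2 * d * n / ((W - 2 * T) ** 2 - (n - 2 * d) * n)

For a binary symmetric channel, W = n, and t_max(n, d, n) exceeds d/2 whenever d < n/2.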
II. DETERMINISTIC LIST DECODING FOR CODES RM(s, m)

We first revisit the Johnson bound on the maximum size of the code list (3) in the Hamming metric d(y, c). Let T = θn and let C be a code of minimum Hamming distance d ≥ δn such that δ > 2θ(1 − θ). Then

  |L_{T;C}(y)| ≤ δ / (δ − 2θ(1 − θ)).   (6)

Following [1], we extend the Johnson bound to the generalized distance D(y, c) and define the list size used in Theorem 1.

Lemma 2: Let C be an (n, d)-code in {±1}^n, let y ∈ R^n be the received vector of generalized weight W, and let T_max be defined in (2). Then the list

  L_{T;C}(y) = {c ∈ C : D(y, c) ≤ T} = {c ∈ C : ⟨y, c⟩ ≥ W − 2T}

of generalized distance T < T_max satisfies bound (5).

Remark. According to [10], code RM(s, m) has

  A(m, s) = 2^s ∏_{i=0}^{m−s−1} (2^{m−i} − 1) / (2^{m−s−i} − 1)

codewords of minimum weight d = d_1. Also, for s ≥ 2 the next nonzero weight d_2 in RM(s, m) satisfies the inequality d_2 ≤ 2^{m−s+1} − 2^{m−s−1}. These estimates give for RM codes with s ≥ 2 a tighter bound

  |L_{T;C}(y)| ≤ 2n(d_2 + A(m, s)(d_2 − d)) / ((W − 2T)^2 − (n − 2d_2)n).   (7)

Proof. For brevity, let L denote the list L_{T;C}(y) and let L be its size. We use the Cauchy-Schwarz inequality ⟨y, b⟩^2 ≤ ⟨y, y⟩⟨b, b⟩ for the vectors y and b = ∑_{c∈L} c:

  ⟨y, ∑_{c∈L} c⟩^2 ≤ ⟨y, y⟩ ⟨∑_{c∈L} c, ∑_{c∈L} c⟩.

Recall that y has squared length ⟨y, y⟩ = n. Also, by construction of the list L,

  ⟨y, ∑_{c∈L} c⟩^2 ≥ L^2 (W − 2T)^2.

On the other hand, given a code C of distance d, we have the inequality

  ⟨∑_{c∈L} c, ∑_{c∈L} c⟩ ≤ Ln + L(L − 1)(n − 2d).

The above inequalities immediately give bound (5). Bound (7) can be proven similarly, by using the inequality

  ⟨∑_{c∈L} c, ∑_{c∈L} c⟩ ≤ Ln + A(m, s)(n − 2d)L + L(L − 1 − A(m, s))(n − 2d_2). □

A binary Reed-Muller code RM(s, m) ⊂ F_2^n of order s consists of vectors f = (..., f(x_1, ..., x_m), ...), where

  f(x_1, ..., x_m) = ∑_{1≤i≤m−s} x_i f_i(x_{i+1}, ..., x_m) + f_{m−s+1}(x_{m−s+1}, ..., x_m).   (8)

Here f_i(x_{i+1}, ..., x_m) ∈ RM(s − 1, m − i). Given a vector f with symbols f(x_1, ..., x_m), we also consider the vector

  f̂ = (..., (−1)^{f(x_1,...,x_m)}, ...)

with symbols ±1. Our goal is to construct the list

  L_{T;s;m}(y) = {f ∈ RM(s, m) : ⟨y, f̂⟩ ≥ W − 2T}.

The recursive algorithm ψ(s, m, T) discussed below extends our former algorithm ψ(1, m, T) of [1] and performs m − s steps. In step i, we construct the intermediate list L_T^{(i)}(y) that includes (but is not limited to) the first i terms

  f^{(i)}(x_1, ..., x_m) ≡ ∑_{j=1}^{i} x_j f_j(x_{j+1}, ..., x_m)

in representation (8). This list is defined as follows. In step i, we represent each position (x_1, ..., x_m) in the form (x, α), where x ∈ {0,1}^i and α ∈ {0,1}^{m−i}. Given two vectors y(x, α), f(x, α), we consider the function

  S_i(f, y) = ∑_{α∈{0,1}^{m−i}} |⟨y(x, α), f̂(x, α)⟩| = ∑_{α∈{0,1}^{m−i}} |∑_{x∈{0,1}^i} y(x, α)(−1)^{f(x,α)}|,

where for each fixed α the inner product is taken over x ∈ {0,1}^i. Note that for any f ∈ L_{T;s;m}(y),

  S_i(f, y) ≥ ⟨y, f̂⟩ ≥ W − 2T.

Also, (−1)^{f(x,α)} = c(−1)^{f^{(i)}(x,α)}, where c = ±1 is the same constant for any given α. Thus, S_i(f, y) = S_i(f^{(i)}, y). In step i, our algorithm ψ(s, m, T) will construct the intermediate list

  L_T^{(i)}(y) = {f^{(i)} ∈ RM(s, m) : S_i(f^{(i)}, y) ≥ W − 2T}.   (9)

We say that list (9) satisfies the Sums criterion. However, we cannot employ Lemma 2 to bound the size of this list, since the inner product ⟨y, f̂⟩ is now replaced with the (less restrictive) Sums function S_i(f^{(i)}, y). Therefore, we will also address the corresponding bounds for L_T^{(i)}(y) via the Sums function. This will be done in Theorems 3 and 4.

III. THE MAIN STEPS OF OUR ALGORITHM

In step i = 1, ..., m − s, our algorithm ψ(s, m, T) takes the list L_T^{(i−1)}(y) and outputs the next list L_T^{(i)}(y). Here we extend any polynomial f^{(i−1)} ∈ L_T^{(i−1)}(y) into a polynomial

  f^{(i)}(x_1, ..., x_m) = f^{(i−1)}(x_1, ..., x_m) + x_i f_i(x_{i+1}, ..., x_m)

that satisfies criterion (9). We will avoid the straightforward approach that tests each element f_i(x_{i+1}, ..., x_m) ∈ RM(s − 1, m − i). To do so, for any retrieved element f^{(i−1)}, we will construct a new received vector y_i(f^{(i−1)}) of length 2^{m−i} and decode it with our recursive algorithm ψ(s − 1, m − i, T). This will allow us to construct the list L_T^{(i)}(y) much more efficiently.
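For concreteness, the Sums function in (9) can be computed as follows (a Python sketch; we index positions so that the i low-order bits of a position form x and the remaining m − i bits form α, which is our own convention):

    def sums_criterion(y, fhat, i, m):
        # S_i(f, y): for each facet alpha, the absolute partial inner product
        # of y and fhat = (-1)^f over the 2^i prefixes x; then sum over alpha.
        block = 2 ** i
        total = 0.0
        for a in range(2 ** (m - i)):
            base = a * block
            partial = sum(y[base + x] * fhat[base + x] for x in range(block))
            total += abs(partial)
        return total

A prefix f^{(i)} is kept if sums_criterion(y, fhat_i, i, m) ≥ W − 2T.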
The main steps of this algorithm are as follows.

Inputs: s, m, y, L_T^{(0)}(y) = {∅}
ψ(s, m, T)(y): {
  for i = 1 to m − s do
    L_T^{(i)}(y) = ∅;
    for f^{(i−1)} ∈ L_T^{(i−1)}(y) do
      Construct y_i = y_i(f^{(i−1)});
      Compute L = ψ(s − 1, m − i, T)(y_i) using the Sums criterion (9);
      for l(x_{i+1}, ..., x_m) ∈ L do
        L_T^{(i)}(y) = L_T^{(i)}(y) ∪ {f^{(i−1)} + x_i l(x_{i+1}, ..., x_m)};
      end for;
    end for;
  end for;
  return(L_T^{(m−s)}(y)) }
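In Python, the pseudocode above takes the following skeleton form (a sketch only: construct_y must build the decoding vector and threshold of Section V, and base_decode must handle the trivial boundary codes RM(0, m) and RM(m, m); neither is spelled out here, and both names are ours):

    def psi(s, m, T, y, construct_y, base_decode):
        # Recursive list decoder: each list element is the tuple (f_1, ..., f_i)
        # of suffixes chosen so far, i.e., a prefix f^(i) in representation (8).
        if s == 0 or s == m:
            return base_decode(s, m, T, y)
        prefixes = [()]                   # L^(0) holds the empty prefix
        for i in range(1, m - s + 1):
            next_prefixes = []
            for f in prefixes:
                yi, Ti = construct_y(f, y, i)   # vector and threshold for step i
                for suffix in psi(s - 1, m - i, Ti, yi, construct_y, base_decode):
                    next_prefixes.append(f + (suffix,))  # f^(i) = f^(i-1) + x_i * suffix
            prefixes = next_prefixes
        return prefixes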
IV. PROOF OF CORRECTNESS

The following theorem will be used to bound the size of any intermediate list obtained via the Sums criterion.

Theorem 3: For any i ∈ [1, m], consider a binary vector

  Q(x_1, ..., x_m) = ∑_{j=1}^{i} x_j R_j(x_{j+1}, ..., x_m)
from RM(s, m) that is not a repetition vector 0^n, 1^n. Then the vector Q̂(x_1, ..., x_m) = (−1)^{Q(x_1,...,x_m)} satisfies the inequality

  ∑_{α∈{0,1}^{m−i}} |∑_{x∈{0,1}^i} Q̂(x, α)| ≤ 2^{m+1} (1/2 − 1/2^s).   (10)

Proof. We first use induction over the degree s of RM(s, m). Let s = 1. For any fixed α ∈ {0,1}^{m−i}, the vector Q(x, α) is the output of a non-constant Boolean linear function. Then the vector Q̂(x, α) has equal numbers of symbols ±1 and ∑_{x∈{0,1}^i} Q̂(x, α) = 0, which proves the theorem for s = 1.

Now let the theorem hold for all 1 ≤ r ≤ s − 1. To proceed with r = s, we use induction over the number of variables m. For m ≤ s − 1, the theorem holds since Q ∈ RM(r, m) for some r ≤ s − 1. Now let the theorem hold if the number of variables is j ≤ m − 1. For j = m, we can write

  Q(x_1, ..., x_m) = T(x_1, ..., x_{i−1}, x_{i+1}, ..., x_m) + x_i R(x_1, ..., x_{i−1}, x_{i+1}, ..., x_m),

where T ∈ RM(s, m − 1) and R ∈ RM(s − 1, m − 1). Then for any α,

  ∑_{x∈{0,1}^i} Q̂(x, α) = ∑_{x∈{0,1}^{i−1}} (T̂(x, α) + T̂(x, α)R̂(x, α))

and

  |∑_{x∈{0,1}^i} Q̂(x, α)| ≤ |∑_{x∈{0,1}^{i−1}} T̂(x, α)| + |∑_{x∈{0,1}^{i−1}} T̂(x, α)R̂(x, α)|.   (11)

Let us assume that both vectors T̂(x, α) and T̂(x, α)R̂(x, α) are not repetition vectors ±1^{n/2}. Then it is easy to verify that inequality (10) follows from (11) by induction from m − 1 to m. If T̂ = ±1^{n/2}, we assume that R is not equal to ±1^{n/2} (otherwise we have the case s = 1). Then for any R ∈ RM(s − 1, m − 1),

  ∑_{α∈{0,1}^{m−i}} |∑_{x∈{0,1}^i} Q̂(x, α)| ≤ 2^m (1/2 − 1/2^{s−1}) + 2^{m−1} ≤ 2^{m+1} (1/2 − 1/2^s).

Finally, let T̂R̂ in (11) be a repetition vector ±1^{n/2}. Then deg(T) ≤ s − 1, since deg(R) ≤ s − 1. Thus, we again obtain (10) by induction. □

Theorem 4: Consider a code RM(s, m) and a received vector y. Let T_max be defined by (2). Then for any T < T_max and any 1 ≤ i ≤ m, the list (9) of code vectors has size L ≤ L_T bounded by (5).

Proof. For brevity, let E denote our list (9). For each f ∈ E and each α ∈ {0,1}^{m−i}, we consider a Boolean function c(f, α) such that c(f, α) = 0 if and only if

  ∑_{x∈{0,1}^i} y(x, α)(−1)^{f(x,α)} ≥ 0.

In other words, c(f, α) inverts the values of f(x, α) on any facet α ∈ {0,1}^{m−i} if ⟨y(x, α), f̂(x, α)⟩ < 0. Let f′(x, α) = f(x, α) + c(f, α). Then

  f̂′(x, α) = ±f̂(x, α),   (12)
since ĉ(f, α) = ±1 for any facet α ∈ {0,1}^{m−i}. Also,

  ∑_{α∈{0,1}^{m−i}} |∑_{x∈{0,1}^i} y(x, α)(−1)^{f(x,α)}| = ∑_{α∈{0,1}^{m−i}} ∑_{x∈{0,1}^i} y(x, α)(−1)^{f(x,α)+c(f,α)} = ⟨y, f̂′⟩.

Then the Cauchy-Schwarz inequality gives

  ⟨y, ∑_{f∈E} f̂′⟩^2 ≤ ‖y‖^2 ‖∑_{f∈E} f̂′‖^2.

Here ‖y‖^2 = n. By construction of the list E,

  ⟨y, ∑_{f∈E} f̂′⟩^2 ≥ L^2 (W − 2T)^2.

Also,

  ‖∑_{f∈E} f̂′‖^2 = ∑_{f∈E} ‖f̂′‖^2 + ∑_{f,g∈E, f≠g} ∑_{α∈{0,1}^{m−i}} ∑_{x∈{0,1}^i} f̂′(x, α)ĝ′(x, α)
    ≤ Ln + ∑_{f,g∈E, f≠g} ∑_{α∈{0,1}^{m−i}} |∑_{x∈{0,1}^i} f̂(x, α)ĝ(x, α)|.
In the last inequality we also used (12). Note that the vector Q̂ = f̂ĝ cannot be a repetition vector ±1^n (otherwise the list E includes two opposite vectors). Then Theorem 3 gives

  ∑_{α∈{0,1}^{m−i}} |∑_{x∈{0,1}^i} Q̂(x, α)| ≤ n − 2d.

Thus,

  L^2 (W − 2T)^2 ≤ n(Ln + L(L − 1)(n − 2d)),

which proves the theorem. □

V. RECURSIVE COMPUTATIONS AND COMPLEXITY
Our next goal is to address the computations performed in each step i of our algorithm. Given any accepted function f^{(i−1)}(x_1, ..., x_m), we need to find its extension f^{(i)} = f^{(i−1)} + x_i f_i(α) that meets condition (9). Here α = (x_{i+1}, ..., x_m). We now rewrite condition (9) as

  ∑_{α∈{0,1}^{m−i}} |Γ_i(α)| ≥ W − 2T.   (13)

Here we use the function

  Γ_i(α) = ∑_{x∈{0,1}^{i−1}, x_i=0,1} y(x, x_i, α)(−1)^{f^{(i−1)}(x,x_i,α) + x_i f_i(α)} = y_0(α, f^{(i−1)}) + f̂_i(α) y_1(α, f^{(i−1)}),

where for each u = 0, 1,

  y_u ≡ y_u(α, f^{(i−1)}) = ∑_{x∈{0,1}^{i−1}} y(x, u, α) f̂^{(i−1)}(x, u, α).   (14)

Thus, for each f^{(i−1)} we only need the two quantities y_u(α, f^{(i−1)}) to estimate the function Γ_i(α) and choose the corresponding extension f_i(α). To optimize the complexity of step i, for each f^{(i−1)} we introduce the vectors

  μ(α, f^{(i−1)}) = min{|y_0|, |y_1|},  μ̄(α, f^{(i−1)}) = max{|y_0|, |y_1|}.

Also, let h(α, f^{(i−1)}) ∈ {±1} denote the sign of y_0 y_1. Then it is readily verified that

  |Γ_i(α)| = μ̄(α, f^{(i−1)}) + μ(α, f^{(i−1)}) h(α, f^{(i−1)}) f̂_i(α).   (15)
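A sketch of the per-facet computation (14)-(15), with the same position indexing as before (x runs over the i − 1 low-order bits, then the bit u = x_i, then α):

    def facet_stats(y, fhat_prev, i, m):
        # For each facet alpha, return (mu, mubar, h) built from
        # y_u(alpha) = sum_x y(x, u, alpha) * fhat^(i-1)(x, u, alpha), u = 0, 1.
        half = 2 ** (i - 1)
        stats = []
        for a in range(2 ** (m - i)):
            base = 2 * half * a
            y0 = sum(y[base + x] * fhat_prev[base + x] for x in range(half))
            y1 = sum(y[base + half + x] * fhat_prev[base + half + x] for x in range(half))
            mu, mubar = min(abs(y0), abs(y1)), max(abs(y0), abs(y1))
            h = 1 if y0 * y1 >= 0 else -1   # sign of y0*y1; ties broken to +1
            stats.append((mu, mubar, h))
        return stats

By (15), |Γ_i(α)| then equals mubar + mu * h * fhat_i(α) for either choice fhat_i(α) = ±1.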
Thus, for each f^{(i−1)}, we can use a new vector

  y(α, f^{(i−1)}) = μ(α, f^{(i−1)}) h(α, f^{(i−1)})

of length 2^{m−i} and rewrite condition (13) as

  ∑_α y(α, f^{(i−1)}) f̂_i(α) ≥ W − 2T − ∑_α μ̄(α, f^{(i−1)}).   (16)

Thus, we replace condition (13) with an equivalent condition (16) that employs the new vector y(α, f^{(i−1)}) and the new threshold

  T(f^{(i−1)}) = W − 2T − ∑_α μ̄(α, f^{(i−1)}).

Then in step i, we decode each vector y(α, f^{(i−1)}) with our algorithm ψ(s − 1, m − i, T(f^{(i−1)})). According to Theorem 4, the two equivalent conditions (13), (16) yield no more than L_T candidates f^{(i)} = f^{(i−1)} + x_i f_i(α). Thus, we can call our recursive decoding procedure ψ(s − 1, m − i, T(f^{(i−1)})) using the updated vectors y(α, f^{(i−1)}) and thresholds T(f^{(i−1)}).

Now we can proceed with complexity bounds and prove Theorem 1. Note that the transition from step i − 1 to step i begins with the two vectors y_u(α, f^{(i−1)}). Then for each f^{(i−1)} we obtain the list of extensions {f_i(α)}. Given f^{(i−1)} and f_i(α), we calculate the product vector

  Y(α, f^{(i)}) = y(α, f^{(i−1)}) f̂_i(α)   (17)

and use the filtering condition (16). Note that this product vector Y(α, f^{(i)}) can be directly used in step i + 1. In particular, for any α′ = (x_{i+2}, ..., x_m), we obtain the two new vectors (14) y_u(α′, f^{(i)}) = Y(u, α′, f^{(i)}). Thus, the complexity of the transition from step i − 1 to step i includes:

A. finding at most L_T vectors y(α, f^{(i−1)});
B. L_T procedures ψ(s − 1, m − i, T(f^{(i−1)}));
C. finding the vectors Y(α, f^{(i)}) for each f^{(i−1)} and f_i(α).

Note that procedure A requires for each vector y(α, f^{(i−1)}) of length 2^{m−i} only 2^{m−i+1} operations with the two vectors y_u(α, f^{(i−1)}). To estimate the total complexity of procedure C, we use the following lemma.

Lemma 5: For each step i = 1, ..., m − s and for each prefix f^{(i−1)}, the recursive procedure ψ(s − 1, m − i, T(f^{(i−1)})) outputs at most 2^s suffixes f_i(α).

Thus, procedure C requires only 2^{m−i} operations for each vector Y(α, f^{(i)}) and at most 2^{m−i+s} operations in total. Finally, note that all procedures ψ(s − 1, m − i, T) for all i = 2, ..., m − s can be considered as parts of one procedure ψ(s, m − 1, T). Thus, the entire complexity of ψ(s, m, T) can be recursively recalculated through:
- performing ψ(s − 1, m − 1, T);
- finding the vectors y(α, f^{(0)}) and Y(α, f^{(1)});
- performing ψ(s, m − 1, T).

Thus, decoding satisfies the complexity estimates

  |ψ(s, m)| ≤ |ψ(s − 1, m − 1)| + |ψ(s, m − 1)| + L_T 2^{m+s}.

We also have the trivial "boundary" estimates |ψ(s, m)| ≤ 2^m for s = 0, m. In this case, it is readily verified that

  |ψ(s, m)| ≤ 2^{m+s} L_T min{s, m − s}.

This completes the proof of Theorem 1. □
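The recursion for |ψ(s, m)| is easy to tabulate numerically (a sketch; we divide out the common factor L_T and take the additive term as 2^{m+s}, per the estimate above):

    from functools import lru_cache

    @lru_cache(maxsize=None)
    def psi_bound(s, m):
        # Upper bound on |psi(s, m)| / L_T from the recursion above,
        # with the trivial boundary cases s = 0 and s = m.
        if s == 0 or s == m:
            return 2 ** m
        return psi_bound(s - 1, m - 1) + psi_bound(s, m - 1) + 2 ** (m + s)

    for m in range(10, 14):              # the ratio to n = 2^m stays bounded
        print(m, psi_bound(2, m) / 2 ** m)

For fixed s, the tabulated values grow linearly in the blocklength n = 2^m, in line with Theorem 1.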
VI. CONCLUDING REMARKS

The above algorithm performs list decoding within the generalized Johnson bound on a memoryless channel and achieves complexity linear in the blocklength n for RM codes of fixed order s. In this regard, we also mention a recent breakthrough of [2], which gives a list decoding algorithm that performs within the code distance d = 2^{m−s} of an RM code of any fixed order s. Namely, it is shown that such a code can be list decoded within radius d(1 − ε) for an arbitrarily small ε with polynomial complexity of order O(n^3). A slight modification [4] of this list-decoding algorithm reduces this order to O(n ln^{s−1} n). An important open problem is to obtain list decoding within the code distance d with linear complexity O(n) for any RM code of fixed order s.
REFERENCES

[1] I. Dumer, G. Kabatiansky, and C. Tavernier, "List decoding of biorthogonal codes and the Hadamard transform with linear complexity," IEEE Trans. Inform. Theory, vol. 54, pp. 4488-4492, 2008.
[2] P. Gopalan, A. R. Klivans, and D. Zuckerman, "List-decoding Reed-Muller codes over small fields," Proc. 40th ACM Symp. Theory of Computing (STOC), pp. 265-274, 2008.
[3] I. Dumer, G. Kabatiansky, and C. Tavernier, "List decoding of Reed-Muller codes up to the Johnson bound with almost linear complexity," Proc. 2006 IEEE Int. Symp. Information Theory (ISIT), Seattle, WA, USA, June 2006.
[4] I. Dumer, G. Kabatiansky, and C. Tavernier, "On complexity of decoding Reed-Muller codes within their code distance," Proc. 11th Int. Workshop on Algebraic and Combinatorial Coding Theory (ACCT), Pamporovo, Bulgaria, pp. 82-85, June 2008.
[5] I. S. Reed, "A class of multiple error correcting codes and the decoding scheme," IRE Trans. Inform. Theory, vol. IT-4, pp. 38-49, 1954.
[6] R. E. Krichevskiy, "On the number of Reed-Muller code correctable errors," Dokl. Soviet Acad. Sciences, vol. 191, pp. 541-547, 1970.
[7] F. J. MacWilliams and N. J. A. Sloane, The Theory of Error-Correcting Codes. Amsterdam: North-Holland, 1981.
[8] G. A. Kabatianskii, "On decoding of Reed-Muller codes in semicontinuous channels," Proc. 2nd Int. Workshop on Algebraic and Combinatorial Coding Theory (ACCT), Leningrad, USSR, pp. 87-91, 1990.
[9] I. Dumer, "Recursive decoding and its performance for low-rate Reed-Muller codes," IEEE Trans. Inform. Theory, vol. 50, pp. 811-823, 2004.
[10] T. Kasami and N. Tokura, "On the weight structure of Reed-Muller codes," IEEE Trans. Inform. Theory, vol. 16, pp. 752-759, 1970.