ISIT 2006, Seattle, USA, July 9 - 14, 2006
On the minimum distance of structured LDPC codes with two variable nodes of degree 2 per parity-check equation

Jean-Pierre Tillich*, Gilles Zémor†

* INRIA Rocquencourt, Projet Codes, Domaine de Voluceau, 78153 Le Chesnay Cedex, France. Email: [email protected]
† ENST, 46 rue Barrault, 75634 Paris 13, France. Email: [email protected]
Abstract— We investigate the minimum distance of structured LDPC codes with two variable nodes of degree 2 per parity-check equation and show that their minimum distance is a sub-linear power function of the code length.
I. INTRODUCTION

It is well known that in the design of LDPC codes with high iterative decoding performance the variable nodes of degree 2 play a very important role. On the one hand, the bit error rate after decoding is significantly higher for these variable nodes (see for instance [12]) and raising their proportion strengthens error floor phenomena (see for example [9]). On the other hand, for iterative decoding to work close to capacity, a large fraction of degree 2 variable nodes seems necessary, so much so that the two conflicting goals, approaching capacity and achieving low error floors, may be irreconcilable¹.

¹ It should be noted that the same problem has also been observed for other families of codes with good iterative decoding performance: multi-edge type LDPC codes [10], turbo-codes, etc.

To illustrate, on the erasure channel all capacity-achieving sequences require the fraction of degree 2 nodes to be nonzero [11]. More specifically, all known sequences of LDPC codes that achieve the capacity of the erasure channel satisfy the technical flatness condition introduced in [11]. This condition implies in particular:
$$\lambda_2 \rho \rightarrow \frac{1}{p} \qquad (1)$$
where p is the erasure probability of the channel, $\lambda_2$ is the fraction of edges incident to variable nodes of degree 2, and $\rho + 1$ is the average, over all edges of the Tanner graph, of the degree of the edge's neighboring check node.

Turning to error floor phenomena, it is desirable to have some control over the code's minimum distance. Note that randomly chosen LDPC codes are known [1] to have, with probability lower-bounded by a non-zero constant, a minimum distance which grows linearly in the code length if and only if $\lambda_2 \rho < 1$. Though this condition involves fewer degree 2 nodes than condition (1), it does not preclude, when $\lambda_2 \rho > 1$, the existence of LDPC codes with a minimum distance linear in the block length n; however it does mean that these LDPC
codes are not accessible through pure random choice methods, but must be structured in some way. A popular way of choosing this structure is to purposefully avoid all cycles in the Tanner graph involving only variable nodes of degree 2. This is because the characteristic vectors of these cycles are codewords, and these tend to have low weight if random choice is applied fully. Note that this is only possible if the number $n_2$ of variable nodes of degree 2 is strictly less than the number m of parity-check equations.

In this paper we shall study the behavior of the minimum distance of structured LDPC codes for which we have exactly $n_2 = m$. This condition is critical in several ways and the motivation for this study is threefold. First, as the preceding discussion illustrates, it is desirable from the point of view of the close-to-capacity performance of iterative decoding to have the largest possible proportion of degree 2 variable nodes. We notice in particular that, for code rate R = 1/2 and large values of ρ, condition (1) translates to $n_2/m \to 1$ when the blocklength n goes to infinity. Secondly, 1 is a critical value of the proportion $n_2/m$ for the minimum distance. Indeed, it can be shown fairly easily that if $n_2/m$ stays above a fixed quantity strictly larger than 1, then the minimum distance of the corresponding LDPC code cannot be larger than a logarithmic function of n. Conversely, if $n_2/m$ is taken to equal a fixed quantity less than 1, then the minimum distance can be made to behave like a linear function of n. Thirdly, when $n_2 = m$ a simple linear-time encoding algorithm can be exhibited by arranging the set of degree 2 nodes into a single cycle [12], [6]. Finally, note that many good families of LDPC codes which show good iterative decoding performance have a number $n_2$ of degree 2 variable nodes which is equal or close to m, see for instance [12], [3].

The specific family of LDPC codes we consider is the ensemble of codes with Tanner graph chosen to have all its m check nodes of degree d, m variable nodes of degree 2 that lie on a single cycle, and all other variable nodes of constant degree c. Equivalently, the codes have a parity-check matrix of the form
$$H = [\, C_m \mid M \,] \qquad (2)$$
where $C_m$ is the $m \times m$ matrix
$$
C_m = \begin{bmatrix}
1 &   &        &        & 1 \\
1 & 1 &        &        &   \\
  & 1 & 1      &        &   \\
  &   & \ddots & \ddots &   \\
  &   &        & 1      & 1
\end{bmatrix}
$$
and where M is an $m \times (n-m)$ matrix such that all of its columns have weight c and all the rows of H have weight d. The constant degree assumption is made in order to simplify the proofs, as in [4], but we hasten to say that our results can be generalized to other degree distributions where the minimum degree of variable nodes of degree > 2 would play the same role as c. Our main result states:

Theorem 1: The structured LDPC code with parity-check matrix (2) has a minimum distance $d_{\min}$ that satisfies, for large blocklength n:
(i) When M is chosen at random uniformly,
$$P\!\left(d_{\min} \le \alpha n^{1-\frac{2}{c}}\right) = O\!\left(\alpha^{\frac{c}{2}}\right) + O\!\left(n^{\frac{2}{c}-1}\right) \quad \text{for even } c,$$
$$P\!\left(d_{\min} \le \alpha n^{1-\frac{2}{c}}\right) = O\!\left(\alpha^{c}\right) + O\!\left(n^{\frac{2}{c}-1}\right) \quad \text{for odd } c.$$
(ii) $d_{\min}$ is always smaller than a quantity of order $O(n^{1-1/c})$.

We note that this behavior is reminiscent of that of serial turbo-codes [5], [7], [8] and that it gives some theoretical evidence to the often quoted experimental observation, see for instance [12], that for this kind of structure it is better to choose larger values of c in order to keep error floors low. It should be noted that the first point of Theorem 1 can be refined to show that the typical minimum distance of such codes is really of order $n^{1-2/c}$. This yields a minimum distance of order only $n^{1/3}$ for c = 3 and of order $\sqrt{n}$ for c = 4.

II. NOTATION AND BASIC DEFINITIONS

We follow here standard notation and terminology for LDPC codes. The blocklength and the minimum distance will always be denoted by n and $d_{\min}$ respectively. As explained in the introduction, we consider the structured ensemble of LDPC codes defined by a parity-check matrix of the form (2). The number m of parity checks is therefore also the number of degree 2 variable nodes, and n − m is the number of degree c variable nodes. By counting in two different ways the number of edges of the Tanner graph of the LDPC code, we obtain that $2m + c(n-m) = dm$. From this we obtain that
$$m = \frac{c}{d-2+c}\, n \qquad (3)$$
$$n - m = \frac{d-2}{d-2+c}\, n. \qquad (4)$$
To an arbitrary codeword x we associate a particular subgraph of the Tanner graph that will play an important role in the discussion which follows. We call this subgraph the subgraph of degree 2 associated to x (we drop the reference to x when no ambiguity can occur). Its vertices are the variable nodes of degree 2 which belong to the support of x, together with the parity checks which involve these variable nodes. We also denote throughout the paper by K (and sometimes also by K′) a universal constant, which may vary between occurrences.
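As a quick numerical illustration of relations (3) and (4) (ours, not part of the original text), the short Python sketch below computes m and n − m for a few arbitrary parameter choices, together with the two distance scales $n^{1-2/c}$ and $n^{1-1/c}$ that appear in Theorem 1. The specific values of n, c and d are assumptions chosen only so that the divisions come out exact.

```python
# Illustration (ours): sizes from relations (3)-(4) and the two distance scales of Theorem 1.
def code_parameters(n, c, d):
    m = c * n // (d - 2 + c)      # number of checks = number of degree-2 variables, relation (3)
    return m, n - m               # n - m is the number of degree-c variables, relation (4)

for n, c, d in [(7000, 3, 6), (10000, 4, 8), (13000, 5, 10)]:
    m, rest = code_parameters(n, c, d)
    print(f"n={n}, c={c}, d={d}: m={m}, n-m={rest}, "
          f"n^(1-2/c)={n**(1 - 2/c):.1f}, n^(1-1/c)={n**(1 - 1/c):.1f}")
```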
III. THE AVERAGE SPECTRUM

We are going to prove the first point of Theorem 1. For this purpose, we assume that the Tanner graph of our code is chosen in the following way. The part of the Tanner graph which corresponds to the variable nodes of degree 2 is built from the matrix $C_m$. To define the part which corresponds to variable nodes of degree > 2, we associate to each check node d − 2 slots and to each variable node of degree c, c slots. Let s be the total number of such slots. We label the slots which are associated to the variable nodes by $v_1, v_2, \ldots, v_s$ and the slots associated to the check nodes by $c_1, \ldots, c_s$. Note that $s = (n-m)c = m(d-2)$. We choose a random permutation π of s elements and put an edge between a variable node and a check node iff there is a slot $v_i$ attached to this variable node and a slot $c_j$ attached to the check node such that $j = \pi(i)$. Note that this graph may have multiple edges. But it is not hard to prove that, with probability lower-bounded by some positive constant, the corresponding graph corresponds to a code with parity-check matrix H of the form specified in the introduction. Therefore, if we prove that some event occurs with large probability for our random model, it will also occur with large probability for the model where H is chosen at random uniformly among all matrices of the form specified in the introduction.
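To make the slot-permutation model above concrete, here is a minimal Python sketch (an illustration of ours, not code from the paper) that samples a Tanner graph in exactly this way: it fixes the cycle part $C_m$, gives each check node d − 2 extra slots and each degree-c variable node c slots, and matches the two sides through a uniformly random permutation. Function and variable names are ours.

```python
import random

def random_structured_tanner_graph(n, c, d, seed=0):
    """Sample the slot-permutation model: m checks of degree d, m degree-2 variables
    on a single cycle (the C_m part), and n - m variables of degree c."""
    m = c * n // (d - 2 + c)        # relation (3); assumes the division is exact
    s = (n - m) * c                 # total number of slots, s = (n - m)c = m(d - 2)
    assert s == m * (d - 2)
    rng = random.Random(seed)

    # Cycle part: degree-2 variable j is connected to checks j and (j + 1) mod m.
    edges = [(j, j) for j in range(m)] + [(j, (j + 1) % m) for j in range(m)]

    # Variable-side slot i belongs to degree-c variable m + i // c;
    # check-side slot pi(i) belongs to check pi(i) // (d - 2).
    perm = list(range(s))
    rng.shuffle(perm)               # the random permutation pi
    for i, pi_i in enumerate(perm):
        var_node = m + i // c
        check_node = pi_i // (d - 2)
        edges.append((var_node, check_node))   # may create multiple edges, as noted in the text
    return m, edges

m, edges = random_structured_tanner_graph(n=7000, c=3, d=6)
print(m, len(edges))                # expect len(edges) = 2m + c(n - m) = d*m
```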
Let $a_{1\cdots t}$ be the expected number of non-zero codewords of weight less than or equal to t. We are going to use the straightforward upper-bound
$$P(d_{\min} \le t) \le a_{1\cdots t}. \qquad (5)$$
The first point in Theorem 1 is a consequence of this inequality and the bound on $a_{1\cdots t}$ given in Proposition 5. To bound $a_{1\cdots t}$ we will first calculate $a_{i,j,k}$, which denotes the expected number of codewords of weight i which involve exactly j variable nodes of degree c and whose subgraph of degree 2 has k connected components. Let $b_{uv}$ be the number of binary vectors $(x_1 x_2 \ldots x_{(d-2)m})$ of weight v in the code of length (d − 2)m defined by the set of parity-check equations
$$x_{1+(i-1)(d-2)} + \cdots + x_{i(d-2)} = 0, \quad i \in \{1, \ldots, m-u\}$$
$$x_{1+(i-1)(d-2)} + \cdots + x_{i(d-2)} = 1, \quad i \in \{m-u+1, \ldots, m\}.$$
We are now ready for the following statement.

Lemma 2:
$$a_{i,j,k} = \frac{\binom{n-m}{j}\,\frac{m}{k}\binom{i-j-1}{k-1}\binom{m-i+j-1}{k-1}\; b_{2k\,cj}}{\binom{(d-2)m}{cj}}. \qquad (6)$$
The proof of this lemma will be given in the full version of this paper and uses calculations very similar to [4]. We bound these quantities with the following lemma and fact.

Lemma 3: Let $w \stackrel{\text{def}}{=} v - u$. Then
$$b_{uv} \le (d-2)^u \left(\frac{m(d-2)(d-1)}{w}\right)^{\frac{w}{2}} e^{\,w\left(1+\frac{1}{m/w-1}\right)}.$$
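Lemma 3 can also be checked numerically: $b_{uv}$ is computable exactly by multiplying one small polynomial per parity-check block, since each block of d − 2 bits independently contributes a weight-generating polynomial with the prescribed parity. The sketch below (ours, with arbitrary small parameters satisfying 0 < w < m) compares this exact value with the bound of Lemma 3.

```python
from math import comb, exp

def exact_buv(m, d, u, v):
    """Number of weight-v binary vectors of length (d-2)m whose first m-u blocks of
    d-2 bits have even parity and whose last u blocks have odd parity."""
    even = [comb(d - 2, w) if w % 2 == 0 else 0 for w in range(d - 1)]
    odd  = [comb(d - 2, w) if w % 2 == 1 else 0 for w in range(d - 1)]
    poly = [1]                                   # coefficients of the running product
    for block in [even] * (m - u) + [odd] * u:
        new = [0] * (len(poly) + d - 2)
        for a, pa in enumerate(poly):
            if pa:
                for b, pb in enumerate(block):
                    new[a + b] += pa * pb
        poly = new
    return poly[v] if v < len(poly) else 0

def lemma3_bound(m, d, u, v):
    w = v - u
    return (d - 2) ** u * (m * (d - 2) * (d - 1) / w) ** (w / 2) * exp(w * (1 + 1 / (m / w - 1)))

m, d, u, v = 40, 6, 4, 10                        # assumed illustrative values with 0 < w < m
print(exact_buv(m, d, u, v), lemma3_bound(m, d, u, v))
```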
The proof of this statement is provided in the appendix. For the binomial coefficients we use the following bounds, which follow fairly straightforwardly from Stirling's approximation, see [2, §II.9].

Fact 1: There exist two constants K and K′ such that
$$K\, e^{-\frac{t^2}{m-t}}\sqrt{\frac{m}{t(m-t)}}\left(\frac{me}{t}\right)^t \;\le\; \binom{m}{t} \;\le\; K'\sqrt{\frac{m}{t(m-t)}}\left(\frac{me}{t}\right)^t.$$
There is also the simpler upper-bound
$$\binom{m}{t} \le \left(\frac{me}{t}\right)^t.$$

Let $a_{j,k}(t) \stackrel{\text{def}}{=} \sum_{i=1}^{t} a_{i,j,k}$.

Lemma 4: Let $\Delta \stackrel{\text{def}}{=} \frac{cj}{2} - k$ and $t = \alpha m^{1-2/c}$. Then there exists some constant K (which depends only on c and d) such that
$$a_{j,k}(t) \le \alpha^k K^j m^{(2/c-1)\Delta}(cj)^{\Delta - j}.$$
From this bound, it is straightforward to deduce the following

Proposition 5: For $t = \alpha m^{1-2/c}$, where $\alpha \le 1$, we have
$$a_{1\cdots t} = O(\alpha^{c/2}) + O(m^{-(1-2/c)}) \quad \text{for even } c$$
$$a_{1\cdots t} = O(\alpha^{c}) + O(m^{-(1-2/c)}) \quad \text{for odd } c$$

Proof: We start the proof by noticing
$$a_{1\cdots t} = \sum_{\Delta=0}^{ct/2} g(\Delta), \qquad (7)$$
where $g(\Delta) \stackrel{\text{def}}{=} \sum_{j} a_{j,cj/2-\Delta}(t)$. We use here the convention that $a_{j,k}(t)$ is equal to zero if k is not an integer. This is a consequence of the fact that $2k \le cj$ and that cj is necessarily even when $a_{j,k}(t)$ is not equal to zero. Note that for $\Delta = 0$, the smallest j for which $a_{j,cj/2-\Delta}(t) \neq 0$ is equal to 1 for even c and 2 for odd c. For positive values of $\Delta$, the smallest j of this kind is always greater than or equal to $\frac{2\Delta}{c}$. This follows from the very definition of $\Delta = cj/2 - k$: cj/2 is always greater than or equal to $\Delta$. The term which dominates the sum in (7) will be seen to correspond to $\Delta = 0$ and is handled by (for even c)
$$g(0) \le \sum_{j=1}^{t} \alpha^{cj/2} K^j (cj)^{-j} \le K' \alpha^{c/2} \qquad (8)$$
where $K' = \sum_{j=1}^{\infty} K^j (cj)^{-j}$. When c is odd, the corresponding sum starts at j = 2 and we upper-bound it by $g(0) = \sum_{j=2}^{t} a_{j,cj/2}(t) \le K' \alpha^{c}$. Note also that for constant $\Delta$ we have
$$g(\Delta) \le \sum_{j=2\Delta/c}^{t} \alpha^{cj/2-\Delta} K^j m^{(2/c-1)\Delta} (cj)^{\Delta-j} \le K' m^{(2/c-1)\Delta} \qquad (9)$$
where $K' = \sum_{j=1}^{\infty} K^j (cj)^{\Delta-j}$.

We are now going to upper-bound the sum $g(\Delta)$ for $\Delta$ larger than some constant which will be specified later. For $j \ge \frac{2\Delta}{c}$, let $u_j \stackrel{\text{def}}{=} K^j (cj)^{\Delta-j}$. We proceed by considering the ratio $u_{j+1}/u_j$:
$$u_{j+1}/u_j = \frac{K}{cj+c}\left(\frac{cj+c}{cj}\right)^{\Delta-j} = \frac{K}{cj+c}\left(1+\frac{1}{j}\right)^{\Delta-j} \le \frac{K}{cj+c}\, e^{(\Delta-j)/j} \le \frac{K}{cj+c}\, e^{c/2-1},$$
where the last inequality follows from $j \ge \frac{2\Delta}{c}$. For $\Delta$ large enough (say greater than some $\Delta_0$), the right-hand term can be made smaller than a constant $\beta < 1$. The term $u_{2\Delta/c} = K^{\frac{2\Delta}{c}}(2\Delta)^{\frac{c-2}{c}\Delta}$ is handled by upper-bounding $\Delta$ by ct/2, which is itself upper-bounded by $\alpha c m^{1-2/c}/2$, to obtain
$$u_{2\Delta/c} \le K^{\frac{2\Delta}{c}}\left(\alpha c m^{1-2/c}\right)^{\frac{c-2}{c}\Delta} \le K''^{\Delta}\, m^{\left(\frac{c-2}{c}\right)^2\Delta}$$
where $K'' \stackrel{\text{def}}{=} c^{\frac{c-2}{c}} K^{\frac{2}{c}}$. From this we deduce that for $\Delta \ge \Delta_0$:
$$g(\Delta) \le \sum_{j=2\Delta/c}^{t} u_{2\Delta/c}\,\beta^{\,j-2\Delta/c}\, m^{-(1-2/c)\Delta} \le \frac{1}{1-\beta}\, K''^{\Delta}\, m^{-\frac{2(c-2)}{c^2}\Delta}. \qquad (10)$$
Using (8), (9) and (10) in (7) yields the proposition.
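To see the structure of this argument numerically, the following sketch (ours; the constant K of Lemma 4 is unknown, so it is simply set to 1, and the parameter values are arbitrary) sums the right-hand side of Lemma 4 over j for the first few values of ∆. The ∆ = 0 term visibly dominates, consistent with the remark that follows.

```python
def g_bound(alpha, c, m, Delta, K=1.0, j_max=200):
    """Sum over j of the Lemma 4 bound alpha^k K^j m^((2/c-1)Delta) (cj)^(Delta-j),
    with k = cj/2 - Delta; terms where k is not a positive integer are skipped."""
    total = 0.0
    for j in range(1, j_max + 1):
        k2 = c * j - 2 * Delta          # equals 2k
        if k2 <= 0 or k2 % 2 != 0:
            continue
        k = k2 // 2
        total += alpha**k * K**j * m**((2 / c - 1) * Delta) * (c * j)**(Delta - j)
    return total

alpha, c, m = 0.5, 4, 10**6             # assumed illustrative values
for Delta in range(5):
    print(Delta, g_bound(alpha, c, m, Delta))
```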
Remark: The previous proof shows that the main contribution to the expectation $a_{1\cdots t}$ corresponds to codewords whose support contains only a very small number j of positions of degree c and for which the subgraph of degree 2 has cj/2 connected components.

IV. AN UPPER BOUND ON THE MINIMUM DISTANCE

Suppose again that C is a structured LDPC code defined by a parity-check matrix of the form (2). To bound the minimum distance $d_{\min}$ of C we shall use the weight constraints on the columns of the matrix M but not on its rows. Let $V_{m,c}$ be the set of binary vectors of length m and weight c. Let us introduce the following metric on $V_{m,c}$: for $u, v \in V_{m,c}$, define the distance $d_{m,c}(u,v)$ to be the minimum weight of a vector $x \in \{0,1\}^m$ such that
$$x\,{}^t\!C_m = u + v. \qquad (11)$$
It is readily checked that $d_{m,c}(\cdot,\cdot)$ is indeed a distance. Note now that if $m_i$ and $m_j$ are respectively the ith and jth columns of the matrix M, we have the following upper bound on the code distance:
$$d_{\min} \le d_{m,c}(m_i, m_j) + 2. \qquad (12)$$
This is because, if x satisfies (11), then the vector of $\{0,1\}^n$ with support $\mathrm{supp}(x) \cup \{m+i, m+j\}$ is a codeword of C.
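Because $C_m$ is the cycle matrix, equation (11) can be solved explicitly: once the first coordinate of x is fixed, all others follow, and at most two candidate solutions exist (a vector and its complement). The following Python sketch (our illustration; names are ours) computes $d_{m,c}(u,v)$ in this way, which makes the ball argument below easy to experiment with.

```python
def d_mc(u, v):
    """Distance d_{m,c}(u, v): minimum weight of x in {0,1}^m with x_{r-1} + x_r = (u+v)_r mod 2
    for every r (indices mod m), i.e. the cycle equation (11). Returns None if no solution."""
    m = len(u)
    s = [a ^ b for a, b in zip(u, v)]
    best = None
    for x0 in (0, 1):                   # the two possible starting values
        x = [x0]
        for r in range(1, m):
            x.append(x[r - 1] ^ s[r])
        if x[m - 1] ^ x[0] == s[0]:     # cyclic consistency check
            w = sum(x)
            best = w if best is None else min(best, w)
    return best

# Tiny example with m = 8 and two weight-3 vectors u, v (arbitrary):
u = [1, 0, 0, 1, 0, 0, 1, 0]
v = [0, 1, 0, 0, 0, 1, 0, 1]
print(d_mc(u, v))                       # by (12), d_min <= d_mc(u, v) + 2 when u, v are columns of M
```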
Now let w be an integer less than $(d_{\min}-2)/2$. For $m \in V_{m,c}$, denote by $B_{m,c}(m, w)$ the ball centered at m and of radius w, i.e. the set of vectors v of $V_{m,c}$ such that $d_{m,c}(m, v) \le w$. Inequality (12) implies that the balls $B_{m,c}(m, w)$ must be disjoint when m ranges over the columns of the matrix M. Therefore, writing $m \in M$ to mean that m is a column of M, we have
$$\sum_{m \in M} |B_{m,c}(m, w)| \le \binom{m}{c}. \qquad (13)$$
Next, we lower-bound $|B_{m,c}(m, w)|$ by a quantity independent of m. The key argument is that $d_{m,c}(m, v) \le w$ if and only if the supports of m and v make up the frontiers of c (possibly empty, not necessarily disjoint) connected components of the circle $\mathbb{Z}/m\mathbb{Z}$ of total length not more than w. To be precise, $d_{m,c}(m, v) \le w$ if and only if there exist 2c not necessarily distinct integers $a_1, \ldots, a_{2c-1}, a_{2c}$ of $\{1, 2, \ldots, m\}$ such that
1) $\sum_{i=0}^{c-1} \langle a_{2i+1} - a_{2i+2}\rangle \le w$, where $\langle a\rangle$ denotes the smallest absolute value of an integer x such that $a = x \bmod m$;
2) the set $\{1, 2, \ldots, 2c-1, 2c\}$ can be partitioned into two sets I and J of equal size c, such that $\mathrm{supp}(m) = \{a_i, i \in I\}$ and $\mathrm{supp}(v) = \{a_i, i \in J\}$.
Therefore, the cardinality of the ball $B_{m,c}(m, w)$ is at least the number of ways of choosing c non-negative integers $\ell_1, \ell_2, \ldots, \ell_c$ such that $\ell_1 + \ell_2 + \cdots + \ell_c \le w$. In other words, for any $m \in V_{m,c}$,
$$|B_{m,c}(m, w)| \ge \sum_{i=0}^{w} \binom{i+c-1}{c-1} = \binom{w+c}{c}.$$
Together with inequality (13) this gives, since the matrix M is made up of n − m columns,
$$(n-m)\binom{w+c}{c} \le \binom{m}{c}.$$
For fixed c and growing m and w we therefore get
$$w^c \le \frac{m^c}{n-m}\,(1+o(1)).$$
Finally, writing $m = \alpha n$, we have obtained
$$\frac{d_{\min}}{2} \le \frac{\alpha}{(1-\alpha)^{1/c}}\, n^{1-1/c}\,(1+o(1)),$$
yielding point (ii) of Theorem 1.

Remark: The same argument stays valid, with practically no modification, if $C_m$ is replaced by any $m \times m$ matrix of constant row and column weights, both equal to 2. With a little more work the argument can be made to give the same asymptotic upper bound on $d_{\min}$ in the more general case when $C_m$ is just an $m \times m$ matrix with constant column weight equal to 2.

REFERENCES

[1] C. Di, T. Richardson, and R. Urbanke. Weight distribution of low-density parity-check codes. To appear in IEEE Trans. Inform. Theory, 2004.
[2] W. Feller. An Introduction to Probability Theory and Its Applications, volume 1. John Wiley and Sons, New York, 3rd edition, 1968.
[3] X. Hu, E. Eleftheriou, and D. M. Arnold. Regular and irregular progressive edge-growth Tanner graphs. IEEE Trans. Inform. Theory, 51(1):386–398, January 2005.
[4] R. Ikegaya, K. Kasai, Y. Shimoyama, T. Shibuya, and K. Sakaniwa. Stopping set distributions of two-edge type LDPC code ensembles. In ISIT 2005, pages 985–989, September 2005.
[5] N. Kahale and R. Urbanke. On the minimum distance of parallel and serially concatenated codes, 1997. Preprint available at http://lthcwww.epfl.ch/ruediger/papers/weight.ps.
[6] D. J. C. MacKay. Information Theory, Inference and Learning Algorithms. Cambridge University Press, 2003.
[7] A. Perotti and S. Benedetto. An upper bound on the minimum distance of serially concatenated convolutional codes. In Proceedings of ISIT 2004, page 314. IEEE, June 2004.
[8] A. Perotti and S. Benedetto. An upper bound on the minimum distance of serially concatenated convolutional codes, 2004. Preprint available at http://commgroup.polito.it/Papers/files/dmin-ub-sccc.pdf.
[9] T. Richardson. Error floor of LDPC codes. In Allerton Conference on Communication, Control and Computing, Monticello, Illinois, October 2003.
[10] T. Richardson and R. Urbanke. Multi-edge type LDPC codes. Preprint available at http://lthcwww.epfl.ch/papers/multiedge.ps, 2004.
[11] A. Shokrollahi. Capacity-achieving sequences. In B. Marcus and J. Rosenthal, editors, Codes, Systems, and Graphical Models, number 123 in IMA Volumes in Mathematics and its Applications, pages 153–166. Springer, 2000.
[12] M. Yang, W. E. Ryan, and Y. Li. Design of efficiently encodable moderate-length high-rate irregular LDPC codes. IEEE Trans. Commun., 52(4):564–571, April 2004.

APPENDIX

Proof of Lemma 3: First let us note that
$$\sum_{v=0}^{m(d-2)} b_{uv}\, x^v = \left(x f(x)\right)^u g(x)^{m-u}$$
where
$$f(x) \stackrel{\text{def}}{=} \frac{(1+x)^{d-2} - (1-x)^{d-2}}{2x} = \sum_{i=0}^{\lfloor\frac{d-3}{2}\rfloor} \binom{d-2}{2i+1}\, x^{2i}$$
$$g(x) \stackrel{\text{def}}{=} \frac{(1+x)^{d-2} + (1-x)^{d-2}}{2} = \sum_{i=0}^{\lfloor\frac{d-2}{2}\rfloor} \binom{d-2}{2i}\, x^{2i}.$$
We now use the fact that any coefficient $a_i$ of a polynomial P with non-negative coefficients can be bounded for any x > 0 by $a_i \le \frac{P(x)}{x^i}$. From this, we obtain that for any x > 0:
$$b_{uv} \le \frac{\left(x f(x)\right)^u g(x)^{m-u}}{x^v} = \frac{f(x)^u\, g(x)^{m-u}}{x^w}. \qquad (14)$$
We choose $x = w^\star \stackrel{\text{def}}{=} \sqrt{\frac{w}{m(d-2)(d-1)}}$. The point of this choice is that it is straightforward to check that the infimum over all positive x is attained at the single positive point for which
$$u\, x f'(x)\, g(x) + (m-u)\, x\, g'(x)\, f(x) = w\, f(x)\, g(x).$$
Then it can be noted that, for the range of parameters we are interested in, the x which satisfies this equation goes to zero with m. By performing a series expansion of $u\, x f'(x) g(x) + (m-u)\, x g'(x) f(x) - w f(x) g(x)$ around 0, we obtain the approximate solution $x \approx w^\star$. Plugging the aforementioned expression for x into Equation (14) and taking the logarithm yields
$$\ln b_{uv} \le u \ln f(w^\star) + (m-u)\ln g(w^\star) - w \ln w^\star.$$
Notice now that $h(x) \stackrel{\text{def}}{=} \frac{f(x)}{d-2} = 1 + \sum_{i=1}^{\lfloor\frac{d-3}{2}\rfloor} \frac{\binom{d-2}{2i+1}}{d-2}\, x^{2i}$ satisfies $h(x) \le g(x)$ for any x > 0. Therefore we obtain that
$$\ln b_{uv} \le u \ln(d-2) + m \ln g(w^\star) - w \ln(w^\star).$$
By noticing that $g(x) = 1 + \sum_{i=1}^{\lfloor\frac{d-2}{2}\rfloor} \binom{d-2}{2i}\, x^{2i}$ and using that $\ln(1+y) \le y$ for all y > 0, we obtain that
$$\ln b_{uv} \le u \ln(d-2) + m \sum_{i=1}^{\lfloor\frac{d-2}{2}\rfloor} \binom{d-2}{2i}\, (w^\star)^{2i} - w \ln w^\star.$$
Note that
$$m \sum_{i=1}^{\lfloor\frac{d-2}{2}\rfloor} \binom{d-2}{2i}\, (w^\star)^{2i} = \sum_{i=1}^{\lfloor\frac{d-2}{2}\rfloor} \binom{d-2}{2i}\, \frac{w^i}{m^{i-1}(d-1)^i(d-2)^i} \le w + w\sum_{i\ge 2}\left(\frac{w}{m}\right)^{i-1} \le w\left(1+\frac{1}{m/w-1}\right),$$
where we used $\binom{d-2}{2i} \le (d-2)^{2i} \le \left((d-2)(d-1)\right)^{i}$. This yields the bound given in Lemma 3.

Proof of Lemma 4: We bound the binomial coefficients appearing in (6) by using Fact 1 as follows:
$$\binom{n-m}{j} \le K\, \frac{(n-m)^j e^j}{j^{j+1/2}} = K\, j^{-1/2}\left(\frac{(d-2)em}{cj}\right)^j,$$
$$\sum_{i=1}^{t} \binom{i-j-1}{k-1} \le \left(\frac{e}{k-1}\right)^{k-1}\sum_{i=1}^{t-1} i^{k-1} \le \left(\frac{e}{k-1}\right)^{k-1}\int_0^t u^{k-1}\,du \le \frac{t^k e^{k-1}}{k(k-1)^{k-1}},$$
$$\binom{m-i+j-1}{k-1} \le \left(\frac{(m-i+j-1)e}{k-1}\right)^{k-1} \le \left(\frac{me}{k-1}\right)^{k-1},$$
$$\binom{(d-2)m}{cj}^{-1} \le K \sqrt{cj}\; e^{\frac{(cj)^2}{(d-2)m-cj}}\left(\frac{cj}{(d-2)me}\right)^{cj}.$$
The term $b_{2k\,cj}$ is upper-bounded by using Lemma 3 as
$$b_{2k\,cj} \le (d-2)^{2k}\left(\frac{m(d-2)(d-1)}{cj-2k}\right)^{\frac{cj-2k}{2}} e^{cj-2k}\; e^{\frac{(cj-2k)^2}{m-cj+2k}}. \qquad (15)$$
From this, by putting all these inequalities together and by gathering similar terms, we obtain that there exists some constant K such that
$$a_{j,k}(t) \le K\cdot E\cdot M\cdot P\cdot R \qquad (16)$$
where the term E consists of
$$E \stackrel{\text{def}}{=} \frac{(d-2)^j\, e^j\, e^{2k-2}\, (d-2)^{2k}\, \left((d-2)(d-1)\right)^{\frac{cj-2k}{2}}\, e^{cj-2k}}{(d-2)^{cj}\, e^{cj}}.$$
Note that after simplification this term turns out to be equal to $E = (d-2)^j\, e^{j-2}\left(\frac{d-1}{d-2}\right)^{\frac{cj-2k}{2}}$, which can be bounded by
$$E \le \left(2^{c/2}\, e\, (d-2)\right)^j.$$
The main term, which is denoted by M, is equal to
$$M \stackrel{\text{def}}{=} m\cdot m^j\cdot t^k\cdot m^{k-1}\cdot m^{\frac{cj-2k}{2}}\cdot m^{-cj}.$$
This term simplifies to
$$M = m^{(1-c/2)j}\, t^k.$$
The term P consists of powers of j and k and is equal to
$$P \stackrel{\text{def}}{=} \frac{(cj)^{cj}\, j^{1/2}}{k\, j^{1/2}\, (cj)^j\, k\, (k-1)^{k-1}(k-1)^{k-1}\, (cj-2k)^{\frac{cj-2k}{2}}}.$$
Performing some simplification, we obtain
$$P = \left(\frac{k}{k-1}\right)^{2(k-1)} \frac{(cj)^{(c-1)j}}{k^{2k}\,(cj-2k)^{\frac{cj-2k}{2}}}.$$
Note that
$$\left(\frac{k}{k-1}\right)^{2(k-1)} = \left(1+\frac{1}{k-1}\right)^{2(k-1)} = e^{2(k-1)\ln(1+1/(k-1))} \le e^2.$$
Note also that
$$k^{2k}\,(cj-2k)^{\frac{cj-2k}{2}} = (cj)^{\frac{cj}{2}+k}\left(\frac{k}{cj}\right)^{2k}\left(1-\frac{2k}{cj}\right)^{\frac{cj-2k}{2}} = (cj)^{\frac{cj}{2}+k}\left[\left(\frac{k}{cj}\right)^{\frac{2k}{cj}}\left(1-\frac{2k}{cj}\right)^{\frac{1}{2}-\frac{k}{cj}}\right]^{cj}.$$
Noticing that $\frac{2k}{cj}$ is in the range [0, 1] and that the function $x \mapsto (x/2)^x(1-x)^{\frac{1-x}{2}}$ is easily shown to be always greater than 2/5 in the range [0, 1], we deduce that there exists some constant K such that
$$P \le K\,(5/2)^{cj}\,(cj)^{(c/2-1)j-k}.$$
The remaining term R is defined by
$$R \stackrel{\text{def}}{=} e^{\frac{(cj-2k)^2}{m-cj+2k}}\; e^{\frac{(cj)^2}{m-cj}}.$$
Note that there exists some constant K such that $R \le K\left(e^{2c^2 j/m}\right)^j$. This term is clearly bounded by some constant K to the j-th power. Putting all these facts together, we obtain that there exists some constant K such that
$$a_{j,k}(t) \le K^j\, m^{(1-c/2)j}\, t^k\, (cj)^{(c/2-1)j-k}.$$
We obtain Lemma 4 by substituting for t the upper-bound $\alpha m^{1-2/c}$ and $\Delta$ for $cj/2 - k$.