IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. IT-30, NO. 4, JULY 1984, p. 611
On the Distribution of de Bruijn Sequences of Given Complexity
TUVI ETZION AND ABRAHAM LEMPEL, FELLOW, IEEE
Abstract: The distribution $\gamma(c, n)$ of de Bruijn sequences of order $n$ and linear complexity $c$ is investigated. It is shown that for $n \ge 4$, $\gamma(2^n - 1, n) \equiv 0 \pmod{8}$, and for $k \ge 3$, $\gamma(2^{2k} - 1, 2k) \equiv 0 \pmod{16}$. It is also shown that $\gamma(c, n) \equiv 0 \pmod{4}$ for all $c$, and $n \ge 3$ such that $cn$ is even.
I. INTRODUCTION

In this paper we investigate the distribution of binary de Bruijn sequences of given linear complexity. The linear complexity $C(S)$ of a sequence $S$ is one of the measures of its predictability: $S$ is completely determined by $2C(S)$ consecutive bits. Although high complexity does not necessarily mean low predictability, the converse is always true: low complexity implies high predictability. In many applications it is therefore important to know the linear complexity.

In the sequel we need the following definitions and notation.

Let $s_1, s_2, \ldots$ denote a string of binary digits. A cyclic, or closed, string is called a sequence and is denoted by $S = [s_0, s_1, \ldots, s_{k-1}]$, where $k = l(S)$ is the length of $S$.

The order of a sequence $S = [s_0, s_1, \ldots, s_{k-1}]$ is the least integer $n$ such that the $n$-tuples $V_i = (s_i, s_{i+1}, \ldots, s_{i+n-1})$, $0 \le i \le k - 1$, with subscripts taken modulo $k$, are all distinct. Such sequences can be viewed as $k$-cycles from a feedback shift register of $n$ stages, where the $n$-tuples $V_i$ are successive states of the register (or of the sequence).

Two sequences $S_1$ and $S_2$ are said to be equivalent, $S_1 \simeq S_2$, if one is a cyclic shift of the other.

A sequence $S$ of length $2^n$ and order $n$ is called a de Bruijn sequence. Note that each of the possible $2^n$ $n$-tuples appears exactly once as a state of $S$. The set of all de Bruijn sequences of order $n$ will be denoted by $DS(n)$.

The complement $cS$ and the reverse $rS$ of a string $S = s_0, s_1, \ldots, s_{k-1}$ are defined by $cS = \bar{s}_0, \bar{s}_1, \ldots, \bar{s}_{k-1}$, where $\bar{s}_i$ is the binary complement of $s_i$, and $rS = s_{k-1}, \ldots, s_1, s_0$. Note that the operators $c$ and $r$ commute.

$S$ is called a CR-sequence if $cS \simeq rS$, or equivalently $crS \simeq S$.

Every sequence $S = [s_0, s_1, \ldots, s_{k-1}]$ satisfies a linear recursion of degree $m \le k$,

$$s_{i+m} + \sum_{j=1}^{m} a_j s_{i+m-j} = 0, \quad i \ge 0.$$

In terms of a shift operator $E$, defined by

$$E s_i = s_{i+1},$$

the linear recursion takes the form

$$\Bigl( E^m + \sum_{j=1}^{m} a_j E^{m-j} \Bigr) s_i = 0, \quad i \ge 0.$$

Let $f(E) s_i = 0$, $i \ge 0$, be the linear recursion of least degree satisfied by $S$. Then the complexity $C(S)$ of $S$ is defined as the degree of $f(E)$ viewed as a polynomial in $E$.

For later reference, we state the following known facts [1], [2]:

Fact 1: $S \in DS(n)$ implies $cS \in DS(n)$ and $rS \in DS(n)$. If $S$ is a CR-sequence, so are $cS$ and $rS$.

Fact 2: If $S$ is a sequence whose length is a power of 2, then $C(S) = c$ if and only if $(E + 1)^{c-1} s_i = 1$, $i \ge 0$.

Fact 3: Let $\gamma(c, n)$ denote the number of de Bruijn sequences of order $n$ and complexity $c$. Then, for $n \ge 3$, $\gamma(c, n) \equiv 0 \pmod{2}$, and for all $n = 2k \ge 4$, $\gamma(c, n) \equiv 0 \pmod{4}$.

Manuscript received February 24, 1983; revised December 19, 1983.
The authors are with the Department of Computer Science, Technion, Haifa, Israel.
0018-9448/84/0700-0611$01.00 ©1984 IEEE
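The definitions of the Introduction can be made concrete with a short sketch. The helper names below (`order`, `equivalent`, `is_de_bruijn`, `is_cr`, and so on) are illustrative choices and do not come from the paper; sequences are represented as tuples of 0/1, which is likewise an assumption of this sketch.

```python
def bits(text):
    """Turn a string such as '0011' into a tuple of integers."""
    return tuple(int(ch) for ch in text)

def rotations(s):
    """The set of all cyclic shifts of the tuple s."""
    return {s[i:] + s[:i] for i in range(len(s))}

def equivalent(s, t):
    """Two sequences are equivalent if one is a cyclic shift of the other."""
    return len(s) == len(t) and t in rotations(s)

def complement(s):
    """The operator c: complement every bit."""
    return tuple(1 - b for b in s)

def reverse(s):
    """The operator r: reverse the string."""
    return tuple(reversed(s))

def order(s):
    """Least n such that all cyclic n-tuples of s are distinct.

    Returns None when s is a repetition of a shorter pattern, in which
    case no such n exists.
    """
    k = len(s)
    for n in range(1, k + 1):
        windows = {tuple(s[(i + j) % k] for j in range(n)) for i in range(k)}
        if len(windows) == k:
            return n
    return None

def is_de_bruijn(s, n):
    """A de Bruijn sequence of order n has length 2^n and order n."""
    return len(s) == 2 ** n and order(s) == n

def is_cr(s):
    """S is a CR-sequence if cS is equivalent to rS."""
    return equivalent(complement(s), reverse(s))

S = bits("0000111101100101")   # order-4 sequence used as an example in Section II
T = bits("00010111")           # an order-3 de Bruijn sequence
print(is_de_bruijn(S, 4), is_de_bruijn(complement(S), 4), is_de_bruijn(reverse(S), 4))
print(is_cr(S), is_cr(T))
```

The first line of output illustrates Fact 1 on one sequence; the second shows that a CR-sequence does occur for the odd order $n = 3$, while the order-4 example is not one.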
The last result, the congruence modulo 4 in Fact 3, is obtained by considering, along with $S \in DS(n)$, the sequences $cS$, $rS$, and $crS$, which are pairwise inequivalent de Bruijn sequences of the same complexity. This technique does not work for odd $n$, in which case $S$ could be a CR-sequence.

In Section II we investigate the value of $\gamma(c, n)$ for $c = 2^n - 1$, and we prove that for $n \ge 4$, $\gamma(2^n - 1, n) \equiv 0 \pmod{8}$, and for $k \ge 3$, $\gamma(2^{2k} - 1, 2k) \equiv 0 \pmod{16}$. Chan et al. [2] conjectured that $\gamma(c, n) \equiv 0 \pmod{4}$. In Section III we prove this conjecture for all $c$ and $n \ge 3$ such that $cn$ is even.

II. ON THE VALUE OF $\gamma(2^n - 1, n)$

It is well known [1] that the maximal complexity of de Bruijn sequences of length $2^n$ is $2^n - 1$. In this section we prove that $\gamma(2^n - 1, n) \equiv 0 \pmod{8}$ for $n \ge 4$, and that $\gamma(2^{2k} - 1, 2k) \equiv 0 \pmod{16}$ for $k \ge 3$.

First, we derive a characterization of all sequences of length $2^n$ and complexity $2^n - 1$. To this end, we need the following definitions.

The weight $W(S)$ of a sequence $S$ is the number of ONES in $S$.

For sequences of length $2^n$ and even weight, we define the subparity, $sp(S)$, of $S$ as the parity of the number of ONES in the even (or odd) positions of $S$, i.e.,

$$sp(S) = s_0 + s_2 + s_4 + \cdots + s_{2^n-2} = s_1 + s_3 + s_5 + \cdots + s_{2^n-1}.$$

Lemma 1: For a sequence $S$ of length $2^n$, $C(S) = 2^n - 1$ if and only if $sp(S) = 1$.

Proof: By Fact 2, $C(S) = c$ if and only if $(E + 1)^{c-1} s_i = 1$ for each $i$. Now,

$$(E+1)^{2^n-2} = \frac{(E+1)^{2^n}}{(E+1)^2} = \left[ \frac{E^{2^{n-1}} + 1}{E + 1} \right]^2 = \left[ E^{2^{n-1}-1} + E^{2^{n-1}-2} + \cdots + E + 1 \right]^2 = E^{2^n-2} + E^{2^n-4} + \cdots + E^2 + 1.$$

Hence, $C(S) = 2^n - 1$ if and only if for each $i$

$$1 = (E+1)^{2^n-2} s_i = s_{i+2^n-2} + s_{i+2^n-4} + \cdots + s_{i+2} + s_i = sp(S).$$

Q.E.D.

Next, we show that the de Bruijn sequences of order $n$ and complexity $2^n - 1$ can be partitioned into equivalence classes of order 8 or 16. To this end, we shall need some more definitions and lemmas.

Let $S \in DS(n)$ and let $zS$ (resp. $uS$) denote the sequence obtained from $S$ by interchanging the positions of the unique runs of $n$ and $n - 2$ ZEROS (resp. ONES). One can readily verify that $zS, uS \in DS(n)$.

Example: For the de Bruijn sequence $S = [0000111101100101]$, we obtain

$zS = [0011110110000101]$,
$uS = [0000110111100101]$,

and

$uzS = zuS = [0011011110000101]$.

Lemma 2: If $S \in DS(n)$ and $C(S) = 2^n - 1$, then $C(zS) = C(uS) = C(zuS) = 2^n - 1$.

Proof: It can be easily verified that $sp(S) = sp(zS) = sp(uS) = sp(zuS)$. This and Lemma 1 imply Lemma 2. Q.E.D.

The following lemma characterizes sequences $S$ of even length for which $S \simeq rS$.

Lemma 3: If $S$ is a sequence of even length, and $S \simeq rS$, then $S$ takes one of the following forms:

a) $S = [X\,rX]$.
b) $S = [b_1 X b_2\,rX]$, $b_1, b_2 \in \{0, 1\}$.

Proof: Suppose $l(S) = m$. We can write $S = [s_1, s_2, \ldots, s_m]$ and $rS = [s_m, \ldots, s_2, s_1]$. Since $S \simeq rS$ there exists an integer $k$ such that

$$[s_1, s_2, \ldots, s_m] = [s_k, \ldots, s_2, s_1, s_m, \ldots, s_{k+1}] = rS.$$

Let $Y_1 = s_1, \ldots, s_k$ and $Y_2 = s_{k+1}, \ldots, s_m$. Then $S = [Y_1 Y_2]$, $Y_1 = rY_1$, and $Y_2 = rY_2$. We distinguish between the following two cases.

Case 1: $k$ is even. Here both $k$ and $m - k$ are even and we can write $Y_1 = X_1 X_2$, $Y_2 = X_3 X_4$, and $S = [Y_1 Y_2] = [X_1 X_2 X_3 X_4] = [X_2 X_3 X_4 X_1]$, where $l(X_1) = l(X_2)$ and $l(X_3) = l(X_4)$. Since $Y_1 = rY_1$ and $l(X_1) = l(X_2)$, we have $X_1 = rX_2$. Since $Y_2 = rY_2$ and $l(X_3) = l(X_4)$, we have $X_3 = rX_4$. Hence letting $X = X_2 X_3$, we obtain $rX = (rX_3\,rX_2) = X_4 X_1$, which implies a).

Case 2: $k$ is odd. Here both $k$ and $m - k$ are odd, and we can write

$$Y_1 = X_1 s_{(k+1)/2} X_2, \qquad Y_2 = X_3 s_{(m+k+1)/2} X_4,$$

and

$$S = [Y_1 Y_2] = [X_1 s_{(k+1)/2} X_2 X_3 s_{(m+k+1)/2} X_4] = [s_{(k+1)/2} X_2 X_3 s_{(m+k+1)/2} X_4 X_1],$$

where $l(X_1) = l(X_2)$ and $l(X_3) = l(X_4)$. As in Case 1, we obtain $X_1 = rX_2$ and $X_3 = rX_4$. Hence letting $X = X_2 X_3$, $b_1 = s_{(k+1)/2}$, and $b_2 = s_{(m+k+1)/2}$, we obtain b). Q.E.D.

Let $G_1$ denote the group generated by the operators $r$, $z$, and $u$ on $DS(n)$. It is easy to verify that $G_1$ is commutative.

Lemma 4: $G_1 = \{e, r, z, u, rz, ru, zu, rzu\}$, where $e$ is the identity operator. For $n \ge 5$ and for each $S \in DS(n)$, $G_1 S \subseteq DS(n)$ consists of eight pairwise inequivalent sequences.

Proof: The given representation of $G_1$ follows from the commutativity of $G_1$ and from the fact that for each $g \in G_1$, $g^2 = e$. It is also easy to verify that each operator of $G_1$ preserves the defining property of de Bruijn sequences and, hence, $G_1 S \subseteq DS(n)$.

Since each $n$-tuple occurs exactly once in every de Bruijn sequence $S$, the unique runs of $n$ zeros, $n$ ones, $n - 2$
zeros, and $n - 2$ ones are nested in $S$ as follows:

$$x_1 1 0^n 1 x_2, \qquad x_3 0 1^n 0 x_4, \qquad x_5 1 0^{n-2} 1 x_6, \qquad \text{and} \qquad x_7 0 1^{n-2} 0 x_8,$$

where $x_i \in \{0, 1\}$, and $a^k$ denotes a sequence of $k$ $a$'s.

TABLE I (rows: $n$ ZEROS, $n$ ONES, $n - 2$ ZEROS, $n - 2$ ONES; entries for $S$, $zS$, $uS$, and $zuS$)

Table I depicts the situation in $S$, $zS$, $uS$, and $zuS$, with respect to the above runs. (Note that since $n - 2 \ge 2$, the operators $z$ and $u$ preserve the value of the $x_i$'s.) It can be seen that the four sequences of this table are pairwise inequivalent. Hence, the four sequences $rS$, $rzS$, $ruS$, and $rzuS$ are pairwise inequivalent also.

Now, let $dS$ denote the sequence obtained from $S$ by deleting two ZEROS and two ONES from the unique runs of $n$ ZEROS and $n$ ONES, respectively. Note that each of the $n$-tuples $(1 0^{n-2} 1)$ and $(0 1^{n-2} 0)$ appears twice in $dS$ and each of the other $n$-tuples of $dS$ appears only once. Note further that $dS$ is a sequence of order $n + 1$. One can readily verify that, viewed as an operator, $d$ commutes with $c$ and with each element of $G_1$. It is also easy to verify that $dS = dzS = duS = dzuS$ and $drS = drzS = druS = drzuS$. Hence to complete the proof, it suffices to show that $dS \not\simeq drS$.

Assume $dS \simeq drS$. Then, also $dS \simeq rdS$, and by Lemma 3, $dS$ takes one of the following forms:

a) $dS = [X\,rX]$.
b) $dS = [b_1 X b_2\,rX]$, $b_1, b_2 \in \{0, 1\}$.

In either case, one of the following $n$-tuples, $(0 0 1^{n-4} 0 0)$, $(1 1 0^{n-4} 1 1)$, and $(0 1 0^{n-4} 1 0)$, has more than half of its bits in $X$ (or in $rX$). Let $V$ be this $n$-tuple. Then, $rV$ has more than half of its bits in $rX$ (or in $X$), which contradicts the fact that $V = rV$ and that $V$ appears only once in $dS$. Q.E.D.

From Lemma 4 we infer that for $n \ge 5$, $DS(n)$ can be partitioned into equivalence classes of order 8. This, together with Lemma 2 and the facts that $\gamma(15, 4) = 8$ and $C(S) = C(rS)$, yields the following theorem.

Theorem 1: For $n \ge 4$, $\gamma(2^n - 1, n) \equiv 0 \pmod{8}$.

The next lemma presents a characterization of CR-sequences, i.e., sequences $S$ such that $cS \simeq rS$.

Lemma 5: A sequence $S$ is a CR-sequence if and only if $l(S)$ is even and $S = [X\,rcX]$, for some $X$.

Proof: If $l(S)$ is even and $S = [X\,rcX]$, then $cS \simeq [cX\,rX] \simeq rS$. Now, let $S = [s_1, s_2, \ldots, s_m]$ be a CR-sequence and suppose $l(S)$ is odd. Then $l(S) - W(S) \ne W(S)$, which implies $W(cS) \ne W(rS)$, or $cS \not\simeq rS$. Therefore, $l(S)$ must be even.

Since $cS \simeq rS$, there exists an integer $k$ such that

$$[\bar{s}_1, \bar{s}_2, \ldots, \bar{s}_m] = [s_k, s_{k-1}, \ldots, s_1, s_m, s_{m-1}, \ldots, s_{k+1}].$$

Let $Y_1 = s_1, \ldots, s_k$ and $Y_2 = s_{k+1}, \ldots, s_m$. Then $S = [Y_1 Y_2]$, $cY_1 = rY_1$, $cY_2 = rY_2$ and, hence, $k$ and $m - k$ are even. Therefore, we can split $Y_1$ and $Y_2$ in the middle to obtain $Y_1 = X_1 X_2$, $Y_2 = X_3 X_4$, and $S = [X_1 X_2 X_3 X_4] = [X_4 X_1 X_2 X_3]$, with $cX_1 = rX_2$, and $cX_3 = rX_4$. Letting $X = X_4 X_1$, we obtain $rcX = (rcX_1\,rcX_4) = X_2 X_3$, which implies $S = [X\,rcX]$. Q.E.D.

Lemma 6 [2]: If $S \in DS(n)$, $n \ge 3$, then $S \not\simeq cS$.

Lemma 7: If $S \in DS(2k)$, $k \ge 3$, then the union of $G_1 S$ and $G_1 cS$ consists of sixteen pairwise inequivalent de Bruijn sequences.

Proof: By Lemma 4, $G_1 S \subseteq DS(2k)$ consists of eight pairwise inequivalent sequences. Clearly, the same is true for $G_1 cS$. Also, as in the proof of Lemma 4, we have

$dS = dzS = duS = dzuS$,
$drS = drzS = druS = drzuS$,
$dcS = dzcS = ducS = dzucS$,

and

$drcS = drzcS = drucS = drzucS$.

Furthermore, the inequivalence $dS \not\simeq drS$, from the proof of Lemma 4, implies $dcS \not\simeq drcS$. To complete the proof, it suffices to show that $dcS \not\simeq drS$ and $dcS \not\simeq dS$, since these inequivalences imply $drcS \not\simeq dS$ and $drcS \not\simeq drS$, respectively.

a) Assume $dcS \simeq drS$. Then $cdS \simeq rdS$, and by Lemma 5, $dS = [X\,rcX]$, for some $X$. Once again, one of the three $n$-tuples, $(0^k 1^k)$, $(1^k 0^k)$, and $(1^{k-1} 0 1 0^{k-1})$, $n = 2k$, has more than half of its bits in $X$ (or in $rcX$). Let $V$ be this $n$-tuple. It follows that $rcV$ has more than half of its bits in $rcX$ (or in $X$), which contradicts the fact that $V = rcV$ and that $V$ appears only once in $dS$. Hence $dcS \not\simeq drS$.

b) Assume $dcS \simeq dS$. $dS$ has two runs of $n - 2$ ONES and two runs of $n - 2$ ZEROS. Hence $dS$ takes one of the following forms:

1) $dS = [0^{n-2} X_1 0^{n-2} X_2 1^{n-2} X_3 1^{n-2} X_4]$.
2) $dS = [0^{n-2} X_1 1^{n-2} X_2 0^{n-2} X_3 1^{n-2} X_4]$.

If 1) holds, then $dcS = [1^{n-2} cX_1 1^{n-2} cX_2 0^{n-2} cX_3 0^{n-2} cX_4] \simeq dS$. Hence, $cX_1 = X_3$ and $cX_2 = X_4$. This implies that

$$S_1 = [0^{n-2} X_1 0^n X_2 1^{n-2} X_3 1^n X_4]$$

is a de Bruijn sequence satisfying $S_1 \simeq cS_1$, which contradicts Lemma 6. If 2) holds, then

$$dcS = [1^{n-2} cX_1 0^{n-2} cX_2 1^{n-2} cX_3 0^{n-2} cX_4].$$

Comparing this with the form of $dS$ in 2), we obtain $X_2 = cX_1$, $X_3 = cX_2$. This implies $X_1 = X_3$, which means that $S$ contains two identical strings, $0^{n-2} X_1 1^{n-2}$ and $0^{n-2} X_3 1^{n-2}$, of length $\ge n$, contradicting the de Bruijn property of $S$. Hence $dcS \not\simeq dS$. Q.E.D.
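The operators $z$ and $u$ and the orbit counts of Lemmas 4 and 7 can be checked numerically on particular sequences. The sketch below is an illustration only: the run-length implementation of `swap_runs`, the greedy "prefer one" generator used to obtain test sequences of orders 5 and 6, and all function names are assumptions of this example, not constructions taken from the paper.

```python
from itertools import groupby, product

def bits(text):
    return tuple(int(ch) for ch in text)

def canonical(s):
    """Lexicographically least cyclic shift: one representative per equivalence class."""
    return min(s[i:] + s[:i] for i in range(len(s)))

def complement(s):
    return tuple(1 - b for b in s)

def reverse(s):
    return tuple(reversed(s))

def swap_runs(s, bit, n):
    """Interchange the unique cyclic runs of n and n-2 copies of `bit`."""
    # Rotate so that position 0 starts a run; groupby then yields the cyclic runs.
    start = next(i for i in range(len(s)) if s[i] != s[i - 1])
    s = s[start:] + s[:start]
    runs = [[b, len(list(g))] for b, g in groupby(s)]
    i, j = runs.index([bit, n]), runs.index([bit, n - 2])
    runs[i][1], runs[j][1] = runs[j][1], runs[i][1]
    return tuple(b for b, length in runs for _ in range(length))

def z(s, n):
    return swap_runs(s, 0, n)

def u(s, n):
    return swap_runs(s, 1, n)

def prefer_one(n):
    """Greedy construction (n >= 2): start from n zeros and append 1 whenever
    that creates a new n-tuple, otherwise 0, until neither is possible."""
    seq = [0] * n
    seen = {tuple(seq)}
    while True:
        for b in (1, 0):
            window = tuple(seq[-(n - 1):] + [b])
            if window not in seen:
                seen.add(window)
                seq.append(b)
                break
        else:
            break
    return tuple(seq[:2 ** n])

def g1_orbit(s, n):
    """Images of s under the eight elements e, r, z, u, rz, ru, zu, rzu."""
    orbit = []
    for use_r, use_z, use_u in product((False, True), repeat=3):
        t = s
        if use_z:
            t = z(t, n)
        if use_u:
            t = u(t, n)
        if use_r:
            t = reverse(t)
        orbit.append(t)
    return orbit

S4 = bits("0000111101100101")
print("zS  =", "".join(map(str, z(S4, 4))))
print("uS  =", "".join(map(str, u(S4, 4))))
print("zuS =", "".join(map(str, z(u(S4, 4), 4))))

S5, S6 = prefer_one(5), prefer_one(6)
classes5 = {canonical(t) for t in g1_orbit(S5, 5)}
classes6 = {canonical(t) for t in g1_orbit(S6, 6) + g1_orbit(complement(S6), 6)}
print(len(classes5), len(classes6))
```

Comparing canonical rotations rather than raw tuples keeps the check independent of where each cyclic sequence happens to be written out.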
Let $G_2$ denote the group generated by the operators $c$, $r$, $z$, and $u$ on $DS(n)$. It is easy to verify that, except for the pairs $(c, z)$ and $(c, u)$, any two of these four generators commute; the exceptional pairs satisfy $cz = uc$ and $cu = zc$. This and Lemma 7 imply the following result.

Lemma 8: $G_2$ is the union of $G_1$ and $G_1 c$. For $k \ge 3$ and $S \in DS(2k)$, $G_2 S \subseteq DS(2k)$ consists of sixteen pairwise inequivalent sequences.

From Lemma 8 we infer that for $k \ge 3$, $DS(2k)$ can be partitioned into equivalence classes of order 16. From this, Lemma 2, and the fact that $C(S) = C(cS) = C(rS)$, we obtain the following theorem.

Theorem 2: For $k \ge 3$, $\gamma(2^{2k} - 1, 2k) \equiv 0 \pmod{16}$.
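The base case $\gamma(15, 4) = 8$ used for Theorem 1, and the subparity criterion of Lemma 1, can be checked by brute force for $n = 4$. The following sketch is only a verification aid under stated assumptions: the depth-first enumeration and the function names are choices made for this example, and the complexity is computed directly from Fact 2 by applying $E + 1$ repeatedly until the all-ones sequence appears.

```python
from collections import Counter

def de_bruijn_sequences(n):
    """Yield one representative (starting with n zeros) of every de Bruijn
    sequence of order n, by depth-first search over unused n-tuples."""
    total = 2 ** n
    seq = [0] * n
    seen = {tuple(seq)}

    def extend():
        if len(seq) == total:
            # The n-1 windows that wrap around the end must be new and distinct.
            padded = seq + seq[:n - 1]
            wrap = [tuple(padded[i:i + n]) for i in range(total - n + 1, total)]
            if len(set(wrap)) == len(wrap) and not seen.intersection(wrap):
                yield tuple(seq)
            return
        for b in (0, 1):
            window = tuple(seq[-(n - 1):] + [b])
            if window not in seen:
                seen.add(window)
                seq.append(b)
                yield from extend()
                seq.pop()
                seen.remove(window)

    yield from extend()

def complexity_fact2(s):
    """C(S) for a nonzero S whose length is a power of 2, using Fact 2:
    C(S) = c exactly when (E+1)^(c-1) applied to S gives the all-ones sequence."""
    t = tuple(s)
    k = len(t)
    for c in range(1, k + 1):
        if all(t):
            return c
        t = tuple(t[(i + 1) % k] ^ t[i] for i in range(k))  # apply E + 1
    raise ValueError("zero sequence, or length not a power of 2")

def subparity(s):
    """Parity of the number of ONES in the even positions of S."""
    return sum(s[0::2]) % 2

all_order4 = list(de_bruijn_sequences(4))
gamma = Counter(complexity_fact2(s) for s in all_order4)
print(len(all_order4), dict(gamma))
print(all((subparity(s) == 1) == (complexity_fact2(s) == 15) for s in all_order4))
```

Because every de Bruijn sequence contains the run $0^n$ exactly once, fixing that run at the start yields exactly one representative per equivalence class, so the counts are counts of inequivalent sequences.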
III. ON THE VALUE OF $\gamma(2k, n)$
Games and Chan [3] derived an algorithm for computing the complexity $C(S)$ of a sequence $S$ of length $2^n$. From this algorithm we derive a method of distinguishing between sequences of even and odd complexity.

The input to the Games and Chan algorithm is a sequence $S$ of length $l(S) = 2^n$. If $S \ne 0^{2^n}$, the complexity $c$ of $S$ is computed recursively as follows. Initially, set $c_n = 0$ and $A_n = S$. At a typical step of the algorithm the left half of $A_m$, $L(A_m) = [a_0, \ldots, a_{2^{m-1}-1}]$, is added to the right half, $R(A_m) = [a_{2^{m-1}}, \ldots, a_{2^m-1}]$, the result being a sequence $B_m$ of length $2^{m-1}$. If $B_m = 0^{2^{m-1}}$, $A_m$ is replaced by $A_{m-1} = L(A_m)$, and the complexity is left unchanged, i.e., $c_{m-1} = c_m$. If $B_m \ne 0^{2^{m-1}}$, $A_m$ is replaced by $A_{m-1} = B_m$, and $c_m$ is replaced by $c_{m-1} = c_m + 2^{m-1}$. The complexity of $S$ is given by $C(S) = c_0 + 1$.

Lemma 9: A nonzero sequence $S$ of length $2^n$ has odd complexity if and only if application of the Games and Chan algorithm to $S$ yields $A_1 = [11]$.

Proof: Since $c_n = 0$, and $c_{m-1} - c_m$ is even for all $m \ge 2$, it follows that $c_1$ is even. For $C(S)$ to be even, $c_0$ must be odd, which happens only if $L(A_1) \ne R(A_1)$. Hence if $A_1 = [11]$, $C(S)$ is odd. One can easily verify that if $S \ne 0^{2^n}$, then $A_m \ne 0^{2^m}$ for each $m$ and, thus, if $C(S)$ is odd, $A_1 = [11]$. Q.E.D.

Lemma 10: Let $Q$ be a string of even length. Then,

a) $[Q] + [rcQ] = [X\,rX]$.
b) $[Q] + [rQ] = [Y\,rY]$.
c) If $Q = rQ$, then $[Q] = [Z\,rZ]$.

Proof: Let $[Q] = [Q_1 Q_2]$, where $l(Q_1) = l(Q_2)$. Then we obtain

a) $[rcQ] = [rcQ_2\,rcQ_1]$ and $[Q] + [rcQ] = [Q_1 + rcQ_2 \;\; Q_2 + rcQ_1] = [(cQ_1 + rQ_2)\; r(cQ_1 + rQ_2)]$.
b) $[rQ] = [rQ_2\,rQ_1]$ and $[Q] + [rQ] = [Q_1 + rQ_2 \;\; Q_2 + rQ_1] = [(Q_1 + rQ_2)\; r(Q_1 + rQ_2)]$.
c) Let $Q = q_1, q_2, \ldots, q_{2m}$ and $Z = q_1, q_2, \ldots, q_m$. If $Q = rQ$, then $q_i = q_{2m-i+1}$, $1 \le i \le m$, and $[Q] = [Z\,rZ]$ as claimed. Q.E.D.

Lemma 11: Let $S \in DS(n)$, $n \ge 3$, be a CR-sequence. Then application of the Games and Chan algorithm to $S$ yields $A_1 = [11]$.

Proof: By Lemma 5, $S = [Q\,rcQ]$ for some $Q$. Since $S \in DS(n)$, it is clear that $Q \ne rcQ$. Applying the Games and Chan algorithm to $A_n = [Q\,rcQ]$, we obtain $A_{n-1} = [Q + rcQ]$. By Lemma 10a), we can write $A_{n-1} = [X\,rX]$. By parts b) and c) of Lemma 10, $A_m = [Y_m\,rY_m]$, for $1 \le m \le n - 1$. Since $S$ is a nonzero sequence, we have $A_1 = [11]$. Q.E.D.

The following is an immediate corollary of Lemmas 9 and 11.

Corollary 1: If $S \in DS(n)$ is a CR-sequence, then $C(S)$ is odd.

The absence of de Bruijn CR-sequences of even complexity makes it possible now to extend Fact 3 to the following broader result.

Theorem 3: $\gamma(c, n) \equiv 0 \pmod{4}$ for all $c$ and $n$ such that $cn$ is even.
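The recursive halving procedure of Games and Chan described in this section can be sketched in a few lines. The code below is a minimal illustration, assuming sequences are given as tuples of 0/1 of length $2^n$ with $n \ge 1$; the function names are hypothetical, and `complexity_fact2` is an independent cross-check based on Fact 2 rather than part of the algorithm in [3].

```python
from itertools import product

def games_chan(s):
    """Return (C(S), A_1) for a nonzero binary sequence S of length 2^n, n >= 1."""
    a = list(s)
    if not any(a):
        raise ValueError("the all-zero sequence is excluded")
    c = 0
    a1 = None
    while len(a) > 1:
        if len(a) == 2:
            a1 = tuple(a)                  # remember A_1 for the parity test of Lemma 9
        half = len(a) // 2                 # half = 2^(m-1) when a is A_m
        left, right = a[:half], a[half:]
        b = [x ^ y for x, y in zip(left, right)]   # B_m = L(A_m) + R(A_m)
        if any(b):
            a = b                          # A_(m-1) = B_m
            c += half                      # c_(m-1) = c_m + 2^(m-1)
        else:
            a = left                       # A_(m-1) = L(A_m), complexity unchanged
    return c + 1, a1                       # C(S) = c_0 + 1

def complexity_fact2(s):
    """Independent check via Fact 2: apply E + 1 until the all-ones sequence appears."""
    t = tuple(s)
    k = len(t)
    for c in range(1, k + 1):
        if all(t):
            return c
        t = tuple(t[(i + 1) % k] ^ t[i] for i in range(k))
    raise ValueError("zero sequence, or length not a power of 2")

S = (0, 0, 0, 0, 1, 1, 1, 1, 0, 1, 1, 0, 0, 1, 0, 1)   # order-4 example of Section II
T = (0, 0, 0, 1, 0, 1, 1, 1)                           # an order-3 CR-sequence
print(games_chan(S), games_chan(T))

# Lemma 9 on every nonzero sequence of length 8: odd complexity iff A_1 = [11].
lemma9_holds = all(
    (games_chan(q)[0] % 2 == 1) == (games_chan(q)[1] == (1, 1))
    for q in product((0, 1), repeat=8) if any(q)
)
print(lemma9_holds)
```

Tracking $A_1$ alongside the complexity makes the parity test of Lemma 9 available without a second pass over the sequence.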
REFERENCES

[1] H. Fredricksen, "A survey of full length nonlinear shift register cycle algorithms," SIAM Rev., vol. 24, pp. 195-221, Apr. 1982.
[2] A. H. Chan, R. A. Games, and E. L. Key, "On the complexities of de Bruijn sequences," J. Combin. Theory, Ser. A, vol. 33, pp. 233-246, Nov. 1982.
[3] R. A. Games and A. H. Chan, "A fast algorithm for determining the complexity of a binary sequence with period $2^n$," IEEE Trans. Inform. Theory, vol. IT-29, pp. 144-146, Jan. 1983.