IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 37, NO. 5, SEPTEMBER 1991
Correspondence

Error and Erasure Control (d, k) Block Codes

H. C. Ferreira and S. Lin
Abstract - New combinatorial and algebraic techniques are presented for systematically constructing different (d, k) block codes capable of detecting and correcting single bit errors, single peak-shift errors, double adjacent errors, and multiple adjacent erasures. Constructions utilizing channel side information, such as the magnetic recording ternary channel output string, or erasures, do not impose any restriction on the k constraint, while some of the other constructions require k = 2d. Due to the small and fixed number of redundant bits, the rates of both classes of constructions can be made to approach the capacity of the d constrained channel for long codeword lengths. All the codes can be encoded and decoded with simple, structured logic circuits.

Index Terms - (d, k) constraints, error-correction codes, erasure-correction codes, recording codes, combined codes.

I. INTRODUCTION
In this correspondence, we investigate the binary (d, k) codes that find application on input-restricted channels, such as digital magnetic recorders [5] or digital optical recorders [6]. For this class of codes, d denotes the minimum runlength of consecutive 0's between successive 1's in the code sequences, while k denotes the maximum runlength of such 0's. The parameter k pertains to the extraction of a clock signal from the code sequences, while the parameter d pertains to the control of intersymbol interference on bandwidth-limited channels [2]. The relationship between code parameters and the physics of recording systems is summarized, e.g., in [5]. During the NRZI (nonreturn-to-zero inverse) precoding step that is usually applied to (d, k) recording codes, the 1's in the (d, k) code sequences are mapped onto transitions in the corresponding binary channel waveform, with runlength constraints of d + 1 and k + 1 similar symbols. When using a recording code that maps m information bits onto n code bits, the minimum separation between transitions is sometimes called the density ratio, DR, where DR = (d + 1)m/n data bit intervals. It is often attempted to maximize the density ratio, since there is usually a minimum separation between transitions that a bandwidth-limited recording system can support, and this minimum separation poses a fundamental limitation on the recording density. Another important parameter in recording systems is the width of the detection window, during which the presence or absence of a transition has to be detected. This detection window is of width m/n data bit intervals, and a large detection window is preferred due to possible bit synchronization imperfections during the read process.

In most current recording systems, an outer error-correcting/detecting code and an inner (d, k) code are concatenated if it is desirable to effect both error control and compliance with the channel's (d, k) input restrictions. It should be noted that the overall rate of the concatenated coding scheme is maximized in order to maximize the recording system's storage capabilities. Due to the low redundancy of the error-control codes that are currently used in recording systems [10], this overall rate is determined primarily by the rate of the (d, k) code.

Recently, coding techniques capable of both error-correction and compliance with the input restrictions of recording media have received some attention [11]-[20]. Specifically, the "combined" codes in [13]-[17] and [20] are (d, k) constrained codes with minimum Hamming distance δ_min ≥ 3. The feasibility of error-correcting (d, k) block codes was proven in [4]. Enumeration procedures and numerical results for a few specific (d, k) constraints and short codeword lengths (n ≤ 20) were presented in [15]. Computationally intensive computer search techniques for establishing upper and lower bounds on the cardinality of codeword sets, and numerical results involving short and medium codeword lengths (n < 41), were presented in [20]. Encoding and decoding procedures for the block codes in [15] and [20] received little consideration.

The emphasis in this correspondence is thus on systematic, analytical code construction procedures that work for any d constraint or codeword length (including long codes) and which furthermore yield codes that can be encoded and decoded with relatively simple, structured logic circuits. The "combined" block codes that we present here are also efficient in terms of achievable code rates. The rates of some of our constructions asymptotically approach the capacity of the (d, k) constrained channel for all values of k > d ≥ 1, while the rates of other constructions approach the capacity of the (d, 2d) constrained channel, and hence the capacity of the d constrained channel for large d.

Manuscript received July 6, 1989; revised March 30, 1991. This work was supported in part by the S. A. Foundation for Research Development, by the S. A. Department of Posts and Telecommunications, and by U.S. NSF Grant NCR-8813480. This work was presented in part at the 22nd Annual Princeton Conference on Information Sciences and Systems, Princeton, NJ, March 16-18, 1988, and at the IEEE International Symposium on Information Theory, San Diego, CA, January 14-19, 1990. This work was accepted for the Special Issue on Coding for Storage Devices, in this TRANSACTIONS, May 1991. H. C. Ferreira is with the Laboratory for Cybernetics, Rand Afrikaans University, P.O. Box 524, Johannesburg, 2000, South Africa. S. Lin is with the Department of Electrical Engineering, University of Hawaii at Manoa, Holmes Hall 483, 2540 Dole Street, Honolulu, HI 96822. IEEE Log Number 9101400.

II. PRELIMINARIES
Shannon [1] first investigated the discrete noiseless channel with input restrictions, using recursive relations to determine the number of suitably constrained sequences of a given length which satisfy the channel's input restrictions. He also defined the asymptotic information rate or capacity of the channel. Tang and Bahl [2] applied these procedures to q-ary, q >= 2, (d, k) limited sequences. From [2] it follows that the characteristic equation of the binary (d, k) constrained channel is given by

    z^{k+1} - z^{k-d} - z^{k-d-1} - ... - z - 1 = 0.                    (1)

The (d, k) input-restricted channel's capacity, C(d, k), in information bits per binary channel symbol, is thus
    C(d, k) ≜ lim_{l→∞} (1/l) log_2 N(l) = log_2 λ,                    (2)

where λ is the largest real root of the above characteristic equation.

We now show that an integer composition can be associated with a (d, k) sequence. A partition of an integer l is by definition a collection of smaller integers, or parts, which sum to l, without regard to order. The corresponding ordered collection of parts, which we shall denote with the string Φ, is called a composition (see, e.g., [3]). Consider for example the unrestricted partitions and compositions of the integer 3:

    partitions:   3, 21, 111;
    compositions: 3, 21, 12, 111.

We henceforth only consider (d, k) sequences starting with a 1 and terminating with at least d 0's. We shall refer to these (d, k) sequences as composition blocks. It follows that a unique composition of the integer l can be associated with each such (d, k) sequence of length l bits. For example, a (d, k) = (1, 2) sequence of length l = 7 bits can be represented with the following two strings:

    binary representation:       S = s_1 s_2 s_3 s_4 s_5 s_6 s_7 = 1010010,
    composition representation:  Φ = φ_1 φ_2 φ_3 = 2 3 2.

We thus let the binary symbol 1 in S denote the beginning of each part in Φ. Note that a composition block with runlength constraints (d, k) is restricted to w = k - d + 1 distinct parts x_i, where

    x_i ≜ d + i,    i = 1, 2, ..., k - d + 1.                    (3)

Note also that the first d + 1 bits of the composition block, i.e., 10^d (in standard exponential notation), can be assumed to be implicit and need not be transmitted over the channel.

Enumerating generating functions are often used in combinatorial theory to evaluate the number of restricted partitions or compositions of integers. Hence, these functions represent an alternative to the frequently used recursive relations to determine the number of (d, k) sequences of a given length l. For example, the number of compositions of l with parts ranging between d + 1 and k + 1, and hence the number of (d, k) sequences of length l, can be determined by evaluating the coefficient of t^l in the corresponding enumerating generating function, c(t). In order to determine c(t), we first set up an enumerating generating function, c_r(t), for compositions with exactly r parts, where the parts range between d + 1 and k + 1 [3, p. 124]:

    c_r(t) = (t^{d+1} + t^{d+2} + ... + t^{k+1})^r.                    (4)

Next, we note that r ranges between ⌈l/(k+1)⌉ and ⌊l/(d+1)⌋. Finally, we sum over all c_r(t) to obtain c(t):

    c(t) = Σ_{r=⌈l/(k+1)⌉}^{⌊l/(d+1)⌋} (t^{d+1} + t^{d+2} + ... + t^{k+1})^r.                    (5)

By imposing certain restrictions on the (d, k) sequences, more compact generating functions can be obtained for such a subset of all (d, k) sequences. For example, Forsberg and Blake [7], [8] derived a generating function for enumerating (d, k) sequences both beginning and ending with a 1:

    c(t) = t(1 - t) / (1 - t - t^{d+1} + t^{k+2}).                    (6)

We now consider the composition representations of (d, k) sequences in the context of a lower bound on the minimum Hamming distance, δ_min, achievable with a (d, k) constrained code. Several such bounds, applying to both runlength constrained and dc-free block codes, were set up by using Gilbert type arguments and published in [4]. Briefly, a (d, k) constrained code with δ_min ≥ 3, mapping m information bits onto n code bits, can be obtained by retaining 2^m codewords from the full set of sequences satisfying the (d, k) constraints. The latter set has cardinality N(n). All sequences at distance i < δ_min from a retained codeword should be purged. Hence,

    N(n) ≥ 2^m Σ_{i=0}^{δ_min - 1} (n choose i).                    (7)

In the next section, we show how to impose restrictions on the composition representations of (d, k) sequences in order to purge all sequences at Hamming distance δ = 1 from a retained codeword.

III. CONSTRUCTIONS FOR SINGLE ERROR-DETECTION WITHOUT CHANNEL SIDE INFORMATION

We first investigate code constructions capable of single error-detection without the aid of channel side information. In later sections we augment some of these constructions with a small number of parity bits in order to effect error-correction. We confine our investigation to block codes where each codeword can be represented with one of the following two equivalent binary string representations:

    Γ_1 = s_1 s_2 ... s_l p_1 b_1^1 ... b_1^d p_2 b_2^1 ... b_2^d ... p_u b_u^1 ... b_u^d,                    (8)

or

    Γ_2 = s_1 s_2 ... s_l p_1 p_2 ... p_w.                    (9)

The bits s_i, 1 ≤ i ≤ l, in (8) and (9) correspond to the composition block introduced in the previous section. At the encoder, information bits are mapped directly onto these bits. Each composition block has a fixed number of bits l and a variable number of parts r, i.e., Φ = φ_1 φ_2 ... φ_r, where ⌈l/(k+1)⌉ ≤ r ≤ ⌊l/(d+1)⌋. The bits p_i and b_i^j in (8), as well as the p_i in (9), are redundant bits introduced for error and erasure control. Each group of bits p_i b_i^1 ... b_i^d, 1 ≤ i ≤ u, in Γ_1 shall be referred to as a redundancy subblock. The p_i represent parity bits and the b_i^j, 1 ≤ j ≤ d, are buffer bits. The buffer bits are initially set to 0, in order to satisfy the d constraint between p_i and p_{i+1}, or between p_u and the s_1 = 1 of the next codeword. If p_i = p_{i+1} = 0, one of the b_i^j may be inverted to 1 in order to reduce the runlength of consecutive 0's. However, as we shall see later, the k constraint within the redundancy subblocks may occasionally exceed the k constraint within the composition block. The Γ_1 representation is thus a systematic codeword with respect to the parity bits, which can be read directly from it. The Γ_2 representation is a nonsystematic codeword with respect to the parity bits, which can only be obtained from the redundant bits by means of a lookup table. The redundant bits p_i, 1 ≤ i ≤ w, in (9) now represent a (d, k) sequence which is catenatable to the s_i, such that both the d and k constraints within the s_i are preserved everywhere in Γ_2, and such that N(w) ≥ 2^u.
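To make the composition-block notation concrete, the following Python sketch converts between the binary representation S and the composition Φ and checks the (d, k) runlength constraints. It is an illustration only, not part of the paper; the function names and example values are assumptions.

```python
# Sketch (not from the paper): converting between the binary composition-block
# representation S and its composition Phi, and checking the (d, k) constraints.

def to_composition(s):
    """Return the parts of a composition block S (a string over {'0','1'}).

    S starts with a 1; each part is the distance from a 1 to the next 1
    (or to the end of the block for the last part)."""
    assert s and s[0] == '1', "a composition block starts with a 1"
    ones = [i for i, bit in enumerate(s) if bit == '1']
    boundaries = ones + [len(s)]
    return [boundaries[i + 1] - boundaries[i] for i in range(len(ones))]

def to_block(parts):
    """Inverse mapping: each part x contributes the pattern 1 followed by x-1 zeros."""
    return ''.join('1' + '0' * (x - 1) for x in parts)

def satisfies_dk(s, d, k):
    """True if every runlength of 0's after a 1 lies between d and k."""
    return all(d <= x - 1 <= k for x in to_composition(s))

if __name__ == "__main__":
    s = "1010010"                        # the (1, 2) example from the text
    print(to_composition(s))             # [2, 3, 2]
    print(to_block([2, 3, 2]))           # 1010010
    print(satisfies_dk(s, 1, 2))         # True
    print(satisfies_dk("1000010", 1, 2)) # False: a runlength of 4 violates k = 2
```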
Error patterns that violate either the d or k constraint within an arbitrary (d, k) sequence can always be detected. However, these error patterns do not include all single errors. By appending one redundancy subblock to the composition block with arbitrary (d, k) constraints in (8), and by setting p_1 equal to the modulo 2 sum of the bits in the composition block, i.e.,

    p_1 = Σ_{i=1}^{l} s_i modulo 2 = r modulo 2,                    (10)

all single errors can be detected. Note that single errors that do not violate the (d, k) constraints change the number of parts in the composition block by one unit. By appending two redundancy subblocks, and setting p_1 and p_2 equal to the binary representation of the number of parts modulo 4, the nature of an error within the composition block, i.e., a drop-in or drop-out of a 1, corresponding to either the increase or decrease of the number of parts within the composition block, can furthermore be detected. When using the construction in (10) for a specific d constraint, k and l can be made arbitrarily large and code rates can approach the capacity of the d constrained channel.

Another approach can also be used to achieve single error-detection. Due to restrictions imposed on the (d, k) sequences, the code rates achievable when using this construction for a specific d constraint are lower than the rates achievable with (10), but this approach can then be used in further constructions capable of error-correction. This approach essentially involves imposing restrictions on the composition rules applying to the composition block. Specifically, we propose restricting the composition block such that it can include either two consecutive parts a and b (in any order) or their arithmetic sum c = a + b, but not both. We shall refer to this rule as the basic composition rule for error-detection. It now follows that the transition of a 0 into a 1 will always partition a larger part c into two smaller parts a and b, which violates the composition rule, and it may furthermore also violate the d constraint. Similarly, the transition of a 1 into a 0 will always merge two smaller parts a and b into one larger part c, which violates the composition rule, and it may furthermore also violate the k constraint. Consequently, not only can all single errors always be detected, but the approximate location of such errors can also be determined. In Section IV we shall construct error-correcting codes using sequences complying with the basic composition rule.

There are three families of composition blocks satisfying the basic composition rule. Recursive relations such as in [1] and [2] can be used for enumeration and to obtain the characteristic equation for each family. While enumeration is sometimes straightforward, this is not always the case. For completeness we thus present all the relevant recursive equations. As an alternative, the procedures in [8] can also be used to set up the generating function for each family, which we also present here.

Family I: The idea is to build a continuous set of integers X, such that (a + b) ∉ X for all a, b ∈ X. This family of composition blocks is thus characterized by the fact that all possible runlengths of 0's between d and k can occur. Furthermore, any runlength may follow any other runlength. If we use

    X ≜ {x_i | x_1 ≤ x_i ≤ x_w = 2x_1 - 1},                    (11)

it follows that the basic composition rule is satisfied. Note that d = x_1 - 1 and k = x_w - 1. Hence, k = 2d. As examples, the sets X = {2,3}, {3,4,5}, {4,5,6,7}, etc., may be used for code constructions. This is the only family that we present where the transition of a 0 into a 1 always violates the d constraint, and the transition of a 1 into a 0 always violates the k constraint.

The following recursions can be used for the enumeration of composition blocks in Family I:

    N(l) = 0,    for l < x_1,                    (12)
    N(l) = 1,    for x_1 ≤ l ≤ x_w,                    (13)

and

    N(l) = Σ_{i=1}^{w} N(l - x_i),    for l > x_w.                    (14)
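The recursions (12)-(14) translate directly into a few lines of code. The sketch below is an illustration under stated assumptions (the function name and the printed example are not from the paper).

```python
# Sketch (assumption: not from the paper): enumerating Family I composition
# blocks with the recursions (12)-(14).

from functools import lru_cache

def family1_counts(x1, length):
    """Return N(0..length) for Family I parts X = {x1, ..., 2*x1 - 1}."""
    parts = list(range(x1, 2 * x1))           # x_1, ..., x_w = 2*x_1 - 1

    @lru_cache(maxsize=None)
    def N(l):
        if l < x1:                             # (12)
            return 0
        if x1 <= l <= 2 * x1 - 1:              # (13): a single part
            return 1
        return sum(N(l - x) for x in parts)    # (14)

    return [N(l) for l in range(length + 1)]

if __name__ == "__main__":
    # d = 1, so x1 = 2 and X = {2, 3}.  N(7) = 3 counts the compositions
    # 2+2+3, 2+3+2 and 3+2+2 of the integer 7.
    print(family1_counts(2, 10))   # [0, 0, 1, 1, 1, 2, 2, 3, 4, 5, 7]
```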
Alternatively, a generating function can be set up. We use the generating function in (6) and note that the (d, k) sequences both beginning and ending with a 1 need to be extended with between d and k = 2d zeros to obtain a Family I composition block. Consequently, the overall generating function [8] will be

    c(t) = [t(1 - t) / (1 - t - t^{d+1} + t^{k+2})] (t^d + t^{d+1} + ... + t^{2d})
         = t(t^d - t^{2d+1}) / (1 - t - t^{d+1} + t^{2d+2})
         = t(t^{x_1 - 1} - t^{x_w}) / (1 - t - t^{x_1} + t^{x_w + 1}).                    (15)

The characteristic equation for Family I follows if we substitute N(l) with z^l in (14):

    z^{x_w} - z^{x_w - x_1} - z^{x_w - x_2} - ... - 1 = 0.                    (16)
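The capacity of such a family is log_2 of the largest real root of its characteristic equation, as in (2). The sketch below (an illustration under stated assumptions, not the authors' program) finds that root for (16) by bisection and reproduces the first Family I entry of Table I.

```python
# Sketch (assumption: not from the paper): capacity of Family I composition
# blocks as log2 of the largest real root of the characteristic equation (16).
# Bisection suffices: the polynomial is negative at z = 1 and positive at z = 2.

from math import log2

def family1_capacity(x1, tol=1e-12):
    parts = list(range(x1, 2 * x1))                 # x_1, ..., x_w
    xw = 2 * x1 - 1

    def f(z):                                       # left-hand side of (16)
        return z ** xw - sum(z ** (xw - x) for x in parts)

    lo, hi = 1.0, 2.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if f(mid) < 0:
            lo = mid
        else:
            hi = mid
    return log2((lo + hi) / 2)

if __name__ == "__main__":
    print(round(family1_capacity(2), 4))            # 0.4057, the (1,2) entry in Table I
```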
Family II: Here the idea is to build a discontinuous set of integers X, such that (a + b) ∉ X for all a, b ∈ X. This family of composition blocks is thus characterized by the absence of some runlengths of 0's between d and k. Similar to Family I, there is no restriction on consecutive runlengths. We start with a set of parts from the first family and extend it while still complying with the basic composition rule. For example, the sets {2,3,7,8} or {3,4,5,11,12,13,19,20,21} may be used for code constructions. We now propose constructing the set X as the union of α continuous subsets X_j, where

    X_j ≜ {x_i^j | x_1^j ≤ x_i^j ≤ x_{w_j}^j},    1 ≤ j ≤ α.

The enumeration of Family II composition blocks again follows from recursions of the form (12)-(14); in particular,

    N(l) = Σ_{j=1}^{α} Σ_{i=1}^{w_j} N(l - x_i^j),    for l > x^α_{w_α}.                    (23)

Alternatively, we can adapt (5) or (6) (using similar procedures as in [8] to adapt (6)) to obtain the generating functions in (24) and (25), respectively:

    c(t) = Σ_r ( Σ_{i,j} t^{x_i^j} )^r,                    (24)

    c(t) = ( Σ_{i,j} t^{x_i^j} ) / ( 1 - Σ_{i,j} t^{x_i^j} ).                    (25)

The characteristic equation follows from (23):

    z^{x^α_{w_α}} - Σ_{j=1}^{α} Σ_{i=1}^{w_j} z^{x^α_{w_α} - x_i^j} = 0.                    (26)
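Whether a candidate part set satisfies the basic composition rule is easy to test mechanically. The following sketch (an illustrative assumption, not a procedure from the paper) checks the rule for the Family I and Family II example sets above.

```python
# Sketch (assumption: not from the paper): checking that a candidate part set X
# satisfies the basic composition rule of Section III, i.e., that the sum of
# any two (not necessarily distinct) parts is never itself a part.

def satisfies_basic_rule(X):
    X = set(X)
    return all((a + b) not in X for a in X for b in X)

if __name__ == "__main__":
    print(satisfies_basic_rule({2, 3}))                              # True  (Family I, d = 1)
    print(satisfies_basic_rule({2, 3, 7, 8}))                        # True  (Family II example)
    print(satisfies_basic_rule({3, 4, 5, 11, 12, 13, 19, 20, 21}))   # True  (Family II example)
    print(satisfies_basic_rule({2, 3, 4, 5}))                        # False (2 + 3 = 5 is a part)
```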
TABLE I
FAMILIES I, II, AND III: CAPACITIES OF COMPOSITION BLOCKS

    Family I            Family II           Family III
    (d,k)    C          (d,k)    C          (d,k)    Y                       C
    (1,2)   0.4057      (1,12)  0.5131      (1,4)    {(2,2),(2,3),(3,2)}     0.4774
    (2,4)   0.4057      (2,20)  0.4400      (2,6)    {(3,3),(3,4),(4,3)}     0.4254
    (3,6)   0.3746      (3,28)  0.3893      (3,7)    {(4,4)}                 0.3833
    (4,8)   0.3432      (4,36)  0.3507      (4,9)    {(5,5)}                 0.3478
    (5,10)  0.3158      (5,44)  0.3201      (5,11)   {(6,6)}                 0.3185
    (6,12)  0.2924      (6,52)  0.2950      (6,13)   {(7,7)}                 0.2941
    (7,14)  0.2724      (7,60)  0.2741      (7,15)   {(8,8)}                 0.2736
    (8,16)  0.2552      (8,68)  0.2564      (8,17)   {(9,9)}                 0.2560
    (9,18)  0.2402      (9,76)  0.2411      (9,19)   {(10,10)}               0.2408

[Fig. 1. Channel capacities. Capacity C(d, k) plotted against d for Family I, Family II, Family III, the d constrained channel, Construction I, and Construction II.]
Family III: We now use any continuous set of integers X with a, b, c ∈ X, c = (a + b), but we delete composition blocks in which a and b occur consecutively. Hence, all possible runlengths of 0's between d and k can occur within the composition block. However, some runlengths may not directly follow other runlengths. We use the continuous set X = {x_i | x_1 ≤ x_i ≤ x_w, x_w ≥ 2x_1} to denote all parts which may occur in the composition block, but need to introduce the set Y to denote all ordered pairs of parts that are not allowed to occur consecutively:

    Y ≜ {(y_i, y_j) | y_i + y_j = x_k,  x_1 ≤ y_i, y_j, x_k ≤ x_w}.                    (27)

For example, X = {2,3,4,5} and Y = {(2,2), (2,3), (3,2)} may be used to comply with the basic composition rule. Again, d = x_1 - 1 and k = x_w - 1. Note that the transition of a 1 into a 0 always violates the k constraint, but the transition of a 0 into a 1 does not always violate the d constraint. The recursions that apply are

    N(l) = 0,    for l < x_1,                    (28)
    N(l) = 1,    for x_1 ≤ l ≤ x_w,                    (29)

and

    N(l) = Σ_{x_i = x_w - x_1 + 1}^{x_w} N(l - x_i) + Σ_{x_i = x_1}^{x_w - x_1} Σ_{x_j = x_w - x_i + 1}^{x_w} N(l - (x_i + x_j)),    for l > x_w.                    (30)

It is now much more difficult to set up a general generating function that converges. However, using the procedures in [8] and investigating the special case where x_w = 2x_1, we arrive at the following generating function:

    c(t) = (t^{x_1} + β(t)(1 + t^{x_1})) / (1 - β(t)(1 + t^{x_1})),    β(t) = t^{x_1+1} + t^{x_1+2} + ... + t^{2x_1},    x_w = 2x_1.                    (31)

From (30) follows the characteristic equation

    z^{2x_w - x_1} - Σ_{x_i = x_w - x_1 + 1}^{x_w} z^{2x_w - x_1 - x_i} - Σ_{x_i = x_1}^{x_w - x_1} Σ_{x_j = x_w - x_i + 1}^{x_w} z^{2x_w - x_1 - (x_i + x_j)} = 0.                    (32)

In Table I we present some numerical capacities for Families I, II, and III. The Family II composition blocks in Table I have α = 3 continuous subsets of parts. For Family III, four compositions, corresponding to x_w = 2x_1 + i, 0 ≤ i ≤ 3, were evaluated for each d constraint and we only present the highest capacity obtained. As shown by Tang and Bahl, the characteristic equation of the binary d constrained channel is

    z^{d+1} - z^d - 1 = 0.                    (33)

The capacity of the d constrained channel is graphed in Fig. 1. Also depicted for various d constraints are the capacities summarized in Table I. It can be seen in Fig. 1 that for d ≥ 5 the capacities of the three families are very similar and in fact closely approach the capacity of the d constrained channel. Consequently, the restriction imposed by the basic composition rule does not lower the achievable code rates significantly for large d. Since Family I represents the simplest compositions and furthermore has the advantage that any single error violates either the d or k constraint, it may be preferred to the other families for code constructions when d is large.
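Family III counts can also be obtained directly from the definition by tracking the last part, which avoids the closed-form recursion altogether. The sketch below is an illustrative assumption (not the authors' enumeration procedure) and can be used to cross-check capacities such as those in Table I.

```python
# Sketch (assumption: not from the paper): counting Family III composition
# blocks directly from their definition by tracking the last part, so that the
# forbidden adjacent pairs in Y are never produced.

from functools import lru_cache

def family3_count(x1, xw, length):
    """Number of compositions of `length` with parts in {x1,...,xw} in which
    no two adjacent parts sum to a value that is itself a part."""
    parts = range(x1, xw + 1)
    forbidden = {(a, b) for a in parts for b in parts if x1 <= a + b <= xw}

    @lru_cache(maxsize=None)
    def count(remaining, last):
        if remaining == 0:
            return 1
        total = 0
        for p in parts:
            if p <= remaining and (last, p) not in forbidden:
                total += count(remaining - p, p)
        return total

    return count(length, 0)          # 0 acts as a dummy "no previous part" marker

if __name__ == "__main__":
    # X = {2,3,4,5}, Y = {(2,2),(2,3),(3,2)}: the (d, k) = (1, 4) example.
    print([family3_count(2, 5, l) for l in range(2, 11)])
```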
In Fig. 2, we show the density ratios and the detection window widths achievable with codes of Family I of rate R = m/n, m, n ≤ 5. These two fundamental limitations are also depicted for two state of the art recording codes, the R = 1/2, (d, k) = (2,7) and the R = 2/3, (d, k) = (1,7) codes. By setting d ≥ 4, we can thus construct codes complying with the basic composition rule, while achieving a larger minimum separation between transitions than the state of the art codes, although at the expense of a smaller detection window width.

[Fig. 2. First family: density ratios and detection window widths for R = m/n codes, m, n ≤ 5, plotted against d, together with the density ratios and window widths of the R = 2/3 (1,7) and R = 1/2 (2,7) codes.]

IV. CONSTRUCTIONS FOR SINGLE ERROR-CORRECTION WITHOUT CHANNEL SIDE INFORMATION

We next investigate the construction of single error-correcting (d, k) block codes, such that each codeword consists of a composition block complying with the basic composition rule, and some appended redundant bits. The assumption is made that no channel side information is available and the decoder only observes the received binary (d, k) sequences.

Firstly, we present a simple example.

Example: Consider a single error-correcting code with composition blocks from Family I. Specifically, let each composition block be composed of parts from the set X = {2,3}. The following is an example of such a composition block:

    S = 101001001010100,
    Φ = 2 3 3 2 2 3.

Note that d = 1 and k = 2, hence two 1's in the code sequences are always separated by either one or two 0's. As stated in Section III, for Family I all single errors can be detected on the premise of a violation of the (d, k) constraints. Specifically, a single channel error always violates the k constraint if it transmutes a 1 into a 0, and always violates one or two adjacent d constraints if it transmutes a 0 into a 1. We next show that if we consider the composition representation Φ, a single error can always be located to be within either one part, φ_i, or two adjacent parts φ_i and φ_{i+1}.

For example, consider the reception of the strings 101001001110100, 101001001000100, or 101001101010100, in which the bits s_10, s_11, and s_7, respectively, are in error. Using the premise that only one bit may be inverted to restore the (d, k) constraints, the errors in the above strings can be located to be within the parts φ_4, φ_5, and φ_3, respectively. Consequently, these errors can be corrected uniquely. On the other hand, if we receive 100001001010100 (s_3 in error) or 101101001010100 (s_4 in error), either the bit s_3 or the bit s_4 may be inverted in each of these two strings in order to restore the (d, k) constraints. In the first binary string, inversion of either s_3 or s_4 corresponds to an error in part φ_1, which will then be x_1 = 2 or x_2 = 3, respectively. In the second binary string, inversion of s_3 corresponds to an error in φ_1 = 3, while inversion of s_4 corresponds to an error in φ_2 = 3. Note that in the original binary string φ_1 + φ_2 = 5, and that the ambiguity in locating the error in these latter two received strings can be resolved by determining the correct solution for φ_1 + φ_2 = 5, i.e., φ_1 = 2 and φ_2 = 3, or φ_1 = 3 and φ_2 = 2. Similarly, by considering all pairs of consecutive parts, and the various single errors affecting either 0's or 1's, it can be shown that all single errors can either be located immediately, or can be located uniquely if we are able to solve the equation

    φ_i + φ_{i+1} = 5.                    (34)
We now associate the string of integers

    Ψ = ψ_1 ψ_2 ... ψ_r = 1 2 ... r                    (35)

with each composition block containing r parts. Hence, we can map each part φ_i onto an integer ψ_i that denotes the order of φ_i in the composition block. Assume that we use the Γ_1 representation in (8) for codewords. We now append one redundancy subblock to the composition block and set p_1 equal to the modulo 2 sum of all the ψ_i onto which one of the two parts, either x_1 = 2 or x_2 = 3, is mapped. For example, if we choose x_1:

    p_1 = Σ ψ_i modulo 2,    ∀ i ∋ φ_i = x_1.                    (36)

Observe now that if we solve (34) incorrectly, it corresponds to an x_1 and an x_2 exchanging order, hence one of the ψ_i in (36) will either increase or decrease by one unit. Consequently, the parity bit p_1, recomputed at the receiver, will be in disagreement with the parity bit p_1 received. Note that it is also necessary to reconstruct Ψ correctly at the decoder: at the encoder ψ_i is associated with the ith 1 in S, and the channel may either transmute a 1 into a 0 or a 0 into a 1. However, as stated, these two types of error respectively violate the k constraint and the d constraint, hence the presence and nature of such an error can be detected, Ψ can be reconstructed correctly, and φ_i can be mapped onto the correct ψ_i for all parts φ_i preceding and succeeding the part(s) affected by the channel error. We thus conclude that we are now able to correct all single errors.

Note that if we use the Γ_1 representation for codewords, the k constraint is violated at the end of codewords if φ_r = k + 1 and p_1 = 0. By using the Γ_2 representation for codewords, with p_1 p_2 = 10 to represent p_1 = 0 and p_1 p_2 p_3 = 100 to represent p_1 = 1, we can construct a variable length code that preserves the k constraint. Alternatively, the k constraint can be preserved with a fixed length (nonsystematic) code with w = 5 redundant bits and p_1 ... p_5 = 10100 or 10010. In both cases the asymptotic achievable code rate of C(1,2) = 0.4057 compares favourably to the single error-correcting (1,3) constrained code in [15], which has a larger k constraint and rate R = 8/19 = 0.42. If it is possible to relax the k constraint slightly at the end of codewords, the Γ_1 representation corresponds to the shortest codeword length, as well as the simplest decoder implementation, since the parity bits can be obtained directly.

A. Generalized Construction for Single Error-Correction

The following is an extension and generalization of the simple code construction in the previous example.

Encoding Algorithm:

a) Map the block of information bits onto a unique composition block. Let this composition block Φ = φ_1 φ_2 ... φ_r correspond to Family I.
b) Generate the string of integers Ψ = ψ_1 ψ_2 ... ψ_r with ψ_i = i and map φ_i onto ψ_i.

c) Determine u = w - 1 different parity bits p_j, 1 ≤ j ≤ w - 1,

    p_j = Σ ψ_i modulo 2,    ∀ i ∋ φ_i = x_j.                    (37)
d) Use either the Γ_1 or Γ_2 representation and append the necessary redundant bits to represent the p_j in (37).

Decoding Algorithm:

a) Check the (d, k) constraints within the received composition block.

b) No violation of the (d, k) constraints within the composition block: leave the composition block as it is. Go to Step h).

c) Violation of the (d, k) constraints within the composition block: determine the nature of the error. Violation of d constraint(s) implies transmutation of a 0 into a 1. Violation of the k constraint implies transmutation of a 1 into a 0.

d) Violation of two adjacent d constraints: invert the binary symbol 1 in the center to 0. Go to Step h).

e) Set up the string Ψ and map φ_i onto ψ_i for all parts φ_i preceding and succeeding the part(s) affected by the channel error.

f) Violation of one d constraint: invert one of the two 1's separated by fewer than d 0's, such that neither of the runlengths of 0's preceding or succeeding the remaining 1 violates the k constraint. Check that the recomputed parity bits p_j' agree with the received parity bits p_j. If the parity bits p_j' and p_j do not agree, the other 1 adjacent to the violated d constraint is in error.

g) Violation of the k constraint: invert one of the 0's within the runlength of 0's exceeding the k constraint, such that the (d, k) constraints are satisfied on either side of the inserted 1, and such that the recomputed parity bits p_j' agree with the received parity bits p_j.

h) Map the restored composition block onto the corresponding block of information bits.

If we start with Family I composition blocks and use the Γ_1 representation for codewords, we append d parity bits and d^2 buffer bits to each composition block, hence the total number of redundant bits is d^2 + d. On the other hand, if we use the fixed length Γ_2 representation for codewords, we append w redundant bits to a composition block, such that N(w) ≥ 2^d. For both representations, the redundancy is usually small, and always stays fixed, irrespective of the length of the composition block. The optimum length of the composition block l can be determined by the error rate and error distribution on the channel. The overall code rate will asymptotically approach the capacity of the (d, k = 2d) constrained channel if l → ∞, and in fact the capacity of the d constrained channel for large d.
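The parity bits of (37), and the way they resolve the ambiguity left by (34), can be illustrated in a few lines of code. The sketch below shows only the parity computation and the final disambiguation step of the decoding algorithm for Family I blocks; it is an illustration under stated assumptions, not the authors' implementation.

```python
# Sketch (assumption: not the authors' implementation): parity bits per (37)
# for a Family I composition block, and their use to choose between the two
# candidate compositions left after a single error has been located to two
# adjacent parts.

def parity_bits(parts, x1, w):
    """p_j = (sum of positions i with parts[i-1] == x_j) mod 2, j = 1..w-1."""
    return [sum(i + 1 for i, p in enumerate(parts) if p == x1 + j) % 2
            for j in range(w - 1)]

def resolve(candidate_a, candidate_b, received_parities, x1, w):
    """Return whichever candidate composition reproduces the received parities."""
    if parity_bits(candidate_a, x1, w) == received_parities:
        return candidate_a
    return candidate_b

if __name__ == "__main__":
    x1, w = 2, 2                          # X = {2, 3}, the (1, 2) example
    sent = [2, 3, 3, 2, 2, 3]
    p = parity_bits(sent, x1, w)          # transmitted parity bits
    # Suppose the decoder has located an error to parts 1 and 2 and is left
    # with the two solutions of phi_1 + phi_2 = 5:
    a = [2, 3, 3, 2, 2, 3]
    b = [3, 2, 3, 2, 2, 3]
    print(resolve(a, b, p, x1, w))        # recovers the transmitted composition
```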
V. CORRECTION OF SINGLE ERRORS WITH CHANNEL SIDE INFORMATION
Channel side information that enhances the error-detection process may be available at the output of some (d, k) input-restricted channels. An important example is the output of the peak detector in digital magnetic recorders [5], where the ternary string

    T = τ_1 τ_2 ... τ_l,    τ_i ∈ {-1, 0, 1},                    (38)

is observed. It is only by taking |τ_i| that the binary (d, k) constrained string S is obtained. Due to the physics of the recording readback process, the 1's in T alternate in polarity, i.e.,

    if τ_i ≠ 0, τ_{i+h} ≠ 0, and τ_{i+j} = 0 for all 1 ≤ j ≤ h - 1, then τ_{i+h} = -τ_i.                    (39)

Consequently, all single errors can be detected in T, since (39) will be violated for some i when such an error occurs. Note now that we employed Family I composition blocks with k = 2d in our constructions for single error-correction in Section IV, in order to be able to detect and determine the approximate location of single errors in S. This information can now be obtained directly from T, and we can now construct single error-correcting codes by starting with a composition block with any (d, k) constraints and by appending redundancy to represent the parity bits in (37). By making k > 2d, we thus obtain an increase in the code rates achievable for small d.

As an example, consider the drop-out of a 1. This is detected since two consecutive 1's in T with the same polarity are observed. This amounts to observing the sum of two adjacent parts c = a + b. Parts φ_i = a and φ_{i+1} = b in S need to be reconstructed by inserting a 1 in the correct position. Similarly, if a 1 drops in, two consecutive 1's with the same polarity are observed and one of these needs to be deleted to reconstruct the two consecutive parts a and b, with a + b = c. In this case, the correct one of only two compositions presented by the channel, φ_i = a_1, φ_{i+1} = b_1 or φ_i = a_2, φ_{i+1} = b_2, needs to be determined.

Note that it may be ambiguous whether a 1 dropped in or out when we observe two consecutive 1's with the same polarity in T. Obviously this ambiguity may be resolved with two parity bits representing r modulo 4, as in Section III. However, making the incorrect choice between insertion/deletion of a 1 to ensure compliance with (39) will lead to the insertion of a new part in an odd position ψ_i and the deletion of one old part in an odd position. Furthermore, both r and all the ψ_{i+j}, 1 ≤ j ≤ r - i, reconstructed at the decoder will either increase or decrease by two units, thus the computation of the parity bits in (37) will not be affected by this change in r. Consequently, any incorrect attempt at reconstruction will be detected with (37). We thus do not need the two parity bits equal to r modulo 4.
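The polarity-alternation property (39) is the piece of side information the decoder exploits, and checking it is a one-pass scan. The sketch below is an illustration under stated assumptions, not the authors' detector.

```python
# Sketch (assumption: not from the paper): detecting single errors in the
# ternary peak-detector output T by checking the polarity-alternation
# property (39): successive nonzero samples must have opposite signs.

def violates_alternation(T):
    """Return the index of the first nonzero sample whose polarity fails to
    alternate with the previous nonzero sample, or None if (39) holds."""
    last = 0
    for i, tau in enumerate(T):
        if tau != 0:
            if last != 0 and tau != -last:
                return i
            last = tau
    return None

if __name__ == "__main__":
    # |T| equals the example block S = 101001001010100 from Section IV.
    clean = [1, 0, -1, 0, 0, 1, 0, 0, -1, 0, 1, 0, -1, 0, 0]
    print(violates_alternation(clean))      # None: (39) holds
    corrupted = list(clean)
    corrupted[5] = 0                        # a 1 drops out
    print(violates_alternation(corrupted))  # 8: two -1's in a row reveal the error
```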
VI. CONSTRUCTIONS FOR SINGLE AND DOUBLE ADJACENT ERROR-CORRECTION

We next consider double adjacent errors. Some double adjacent errors violate the (d, k) constraints, hence these errors can be detected, located, and corrected with the single error-correcting codes in the previous section. However, this does not hold for the double adjacent errors that may be due, e.g., to a peak shift that occurs in the detection process in magnetic recording, which affects a 1 in T or S and either the 0 preceding or succeeding it. These error patterns may cause two adjacent parts in Φ, φ_h = x_i and φ_{h+1} = x_{i+1} or φ_{h+1} = x_{i-1}, where x_1 ≤ x_i - 1 and x_i + 1 ≤ x_w, to exchange order. Consequently, such peak-shift errors can be detected but not corrected with the aid of the parity bits for the single error-correcting codes in (37).

Fredrickson and Wolf [18] also investigated constructions to detect single peak-shift errors, using related though decidedly different techniques. Briefly, they associate the string of integers Ξ = ξ_1 ξ_2 ... ξ_l = 1 2 ... l with any (d, k) sequence of length l bits, S = s_1 s_2 ... s_l, where S has exactly x 1's. Finally, they only retain codewords with π either odd or even, where

    π = Σ ξ_i,    ∀ i ∋ s_i = 1.                    (40)
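The check value π of (40) is simply the sum of the positions of the 1's in S. The sketch below is an illustrative assumption (not code from [18] or from this paper).

```python
# Sketch (assumption: not from [18] or the paper): the check value pi of (40),
# i.e., the sum of the 1-based positions of the 1's in S.

def fw_pi(s):
    return sum(i + 1 for i, bit in enumerate(s) if bit == '1')

if __name__ == "__main__":
    s = "101001001010100"          # the running (1, 2) example block
    print(fw_pi(s), fw_pi(s) % 2)
    # A single peak shift moves one 1 by one position, so it always changes
    # the parity of pi; retaining only codewords with pi of a fixed parity
    # therefore detects such shifts.
```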
We now construct codes that have lower rates than the codes in Sections IV and V, but that are capable of correcting all single peak-shift errors, as well as all other single and double adjacent errors. Observe that, if we allow the parts φ_i = a and φ_{i+1} = b to occur consecutively in a composition block, the location of peak-shift errors can be determined if we delete the consecutive parts φ_i = a - 1 and φ_{i+1} = b + 1, as well as the consecutive parts φ_i = a + 1 and φ_{i+1} = b - 1. We thus propose the following constructions to correct all double adjacent errors.

A. Construction I for Single and Double Adjacent Error-Correction

We start with a set of parts from Family I and, similar to the compositional structure of Family III, we now impose restrictions on pairs of consecutive parts. We introduce the set Z to denote all pairs of consecutive parts that may be included in Φ. The set Y still denotes the pairs of consecutive parts that are not allowed to occur. For example, X = {2,3}, Z = {(2,2), (2,3), (3,3)}, Y = {(3,2)}, or X = {3,4,5}, Z = {(3,3), (3,4), (3,5), (5,4), (5,3), (5,5)}, Y = {(4,3), (4,4), (4,5)} may be used for compositions. In order to maximize the achievable code rates, the number of tuples in Z should be maximized while simultaneously minimizing the number of tuples in Y.

We still append redundancy to represent w - 1 parity bits, p_j, 1 ≤ j ≤ w - 1, computed as in (37), to each composition block. The encoding and decoding procedures are similar to those of the single error-correcting codes. Double adjacent errors may be detected either due to violation of the (d, k) constraints or due to the transmutation of a tuple in Z into a tuple in Y. In either case, the errors can be located to one or two adjacent parts as before, and corrected with the aid of the parity bits. The composition blocks of this construction thus form a subset of the composition blocks in Family I.

The following recursions apply to Construction I:

    N(l) = 0,    for l < x_1,                    (41)
    N(l) = 1,    for x_1 ≤ l ≤ x_w,                    (42)

and

    N(l) = Σ N(l - (x_i + x_j)),    ∀ (x_i, x_j) ∈ Z,    for l > x_w.                    (43)

We thus obtain the characteristic equation

    z^{2x_w} - Σ z^{2x_w - (x_i + x_j)} = 0,    ∀ (x_i, x_j) ∈ Z.                    (44)

Some numerical capacities for Construction I are tabulated in Table II and graphed in Fig. 1.

TABLE II
CONSTRUCTION I: CAPACITIES OF SOME COMPOSITION BLOCKS

    X                 Y                                                                          (d,k)    C
    {2,3}             {(3,2)}                                                                    (1,2)    0.3218
    {3,4,5}           {(4,3),(4,4),(4,5)}                                                        (2,4)    0.3310
    {4,5,6,7}         {(5,4),(5,5),(5,6),...,(6,5),(6,6),...}                                    (3,6)    0.2940
    {5,6,7,8,9}       {(6,5),(6,6),(6,7),(6,8),(6,9),(8,5),(8,6),(8,7),(8,8),(8,9)}              (4,8)    0.2886
    {6,7,8,9,10,11}   {(7,6),(7,7),(7,8),(7,9),(7,10),(8,10),(9,6),(9,7),(9,8),(9,10),(10,7),(10,8),(10,9),(10,10),(10,11)}    (5,10)   0.2680
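One way to sanity-check a candidate set Z of allowed consecutive pairs for Construction I is to verify that no pair in Z can be turned into another pair in Z by a single peak shift. The sketch below is an illustrative assumption, not the authors' procedure.

```python
# Sketch (assumption: not the authors' procedure): verifying that a set Z of
# allowed consecutive-part pairs for Construction I is peak-shift detectable,
# i.e., no pair in Z maps to another pair in Z when one part grows by 1 and
# its neighbour shrinks by 1.

def peak_shift_detectable(X, Z):
    X, Z = set(X), set(Z)
    for (a, b) in Z:
        for (c, d) in ((a + 1, b - 1), (a - 1, b + 1)):
            if c in X and d in X and (c, d) in Z:
                return False
    return True

if __name__ == "__main__":
    # The two examples given for Construction I:
    print(peak_shift_detectable({2, 3}, {(2, 2), (2, 3), (3, 3)}))                     # True
    print(peak_shift_detectable({3, 4, 5},
                                {(3, 3), (3, 4), (3, 5), (5, 3), (5, 4), (5, 5)}))     # True
    # Allowing both (2, 3) and (3, 2) would defeat peak-shift detection:
    print(peak_shift_detectable({2, 3}, {(2, 2), (2, 3), (3, 2), (3, 3)}))             # False
```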
The generating function is a series that does not converge and it is not presented here. Some numerical capacities are tabulated in Table II and graphed in Fig. 1. B. Construction II for Single and Double Adjacent Error-Correction In this construction, we start again with a set of parts from Family I but now delete every second part from X, starting with x2, i.e., we delete x2,, 1 I j< [w/2]. Consequently, X will finally contain either only odd parts or only even parts. Any peak-shift error will now respectively transform two adjacent odd parts into two even parts, or two adjacent even parts into two odd parts. It follows that such errors can always be detected, located to two adjacent parts, and corrected with the aid of parity bits in (37). Similar encoding and decoding procedures can be used as before. Note that if xi is odd, the initial cardinality of X is also odd, and the number of remaining parts in X exceeds the number of deleted parts with one. On the other hand, if x1 is even, the initial cardinality of X is also even, and the number of remaining parts is equal to the number of deleted parts. The reduction in the channel capacity is thus somewhat less if xi is odd. For example, X = {3,5}, X = {4,6} or X = {5,7,9}, etc., may be used for compositions. Enumeration follows if we let (45) If x1 is odd, we then obtain the following N(1) = 0,
N(l)=
forI>x,.
C
We thus obtain the characteristic
and V(xi,x,)EZ,
Cd, k)
for I< x,,
1,
i
recursive relations
0,
(46)
I= x, +2i,
Osisp.,
l=x,+2i+l,
Olilp-1,
forx,IlIx,, (47)
and

    N(l) = Σ_{i=0}^{μ} N(l - (x_1 + 2i)),    for l > x_w.                    (48)

A generating function, similar to the generating function for Family II in (25), can be obtained using the procedures in [8]:

    c(t) = ( Σ_{i=0}^{μ} t^{x_1 + 2i} ) / ( 1 - Σ_{i=0}^{μ} t^{x_1 + 2i} ).                    (49)

The corresponding characteristic equation obtained from (48) is

    z^{x_w} - Σ_{i=0}^{μ} z^{x_w - x_1 - 2i} = 0.                    (50)

Numerical capacities are tabulated in Table III and graphed in Fig. 1.

TABLE III
CONSTRUCTION II: CAPACITIES OF COMPOSITION BLOCKS

    X                     (d,k)    C
    {3,5}                 (2,4)    0.2556
    {4,6}                 (3,5)    0.2028
    {5,7,9}               (4,8)    0.2336
    {6,8,10}              (5,9)    0.2028
    {7,9,11,13}           (6,12)   0.2074
    {8,10,12,14}          (7,13)   0.1873
    {9,11,13,15,17}       (8,16)   0.1859
    {10,12,14,16,18}      (9,17)   0.1756

We next investigate alternative constructions such that the composition block complies only with the restrictions imposed by Family I, but we increase the number of parity bits.

C. Construction III for Single and Double Adjacent Error-Correction

We start with a composition block Φ corresponding to Family I. Next we determine the binary string Θ by modulo 2 integration of the parts in Φ, i.e.,

    θ_j = Σ_{i=1}^{j} φ_i modulo 2,    1 ≤ j ≤ r.                    (51)

We now encode Θ with an (n, m) linear single error-correcting code with m ≥ ⌊l/(d+1)⌋. If r < m, where r is the number of parts contained in Φ, we can use a shortened linear code. Consider the following example where X = {3,4,5}, l = 32, and m ≥ 10:

    Φ = 3 4 5 5 4 3 5 3,
    Θ = 1 1 0 1 1 0 1 0.

We may thus encode Θ with a shortened (15,11) Hamming code. Next we use the Γ_1 representation and append 3 redundancy subblocks to Φ with the parity bits p_j, 1 ≤ j ≤ 3, determined by (37), and finally 4 more redundancy subblocks with the parity bits p_j, 4 ≤ j ≤ 7, determined by the Hamming code.

At the decoder, single error-correction proceeds as before with the aid of p_1 to p_3. If two adjacent parts x_i and x_{i+1} are changed due to double adjacent errors, only one bit in Θ will change. For example, if Φ' is received, then Θ' is recomputed at the decoder:

    Φ' = 3 4 5 4 5 3 5 3,
    Θ' = 1 1 0 0 1 0 1 0.

The single error in Θ' can thus be located to θ_4 with the aid of p_4 to p_7. Consequently, we have located the channel error to two adjacent parts φ_4 and φ_5, and hence Φ' can be restored to Φ with the aid of p_1 to p_3. Note that in general we still need the parity bits in (37) for single error-correction. For example, let X = {4,5,6,7}. Now if φ_i = 4, φ_{i+1} = 6 and the (i+1)th 1 in S is transmuted into a 0, merging φ_i and φ_{i+1} into a single part φ_i + φ_{i+1} = 10, it will be impossible to distinguish between φ_i = 4, φ_{i+1} = 6 and φ_i = 6, φ_{i+1} = 4 on the premise of Θ alone. Similarly, if φ_i = 5 and φ_{i+1} = 6 and a peak-shift error shifts the (i+1)th 1 in S to the right, so that the receiver observes φ_i = 6, φ_{i+1} = 5, it will be impossible to distinguish between φ_i = 7, φ_{i+1} = 4 and φ_i = 5, φ_{i+1} = 6 on the premise of Θ alone. Furthermore, note that since any two parity bits are separated by d ≥ 1 buffer bits, any pair of double adjacent errors can affect no more than one of the parity bits p_4 to p_7, hence not exceeding the single error-correcting code's correction capability.

D. Construction IV for Single and Double Adjacent Error-Correction

As before, we start with a composition block Φ corresponding to Family I. We now determine the string of integers Θ by modulo w integration of the parts in Φ, i.e.,

    θ_j = Σ_{i=1}^{j} φ_i modulo w,    1 ≤ j ≤ r.                    (52)
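The "integration" step shared by Constructions III and IV is a running sum of the parts reduced modulo 2 or modulo w. The sketch below (an illustrative assumption, not the authors' implementation) reproduces the Construction III example and shows that a double adjacent error which preserves the sum of two adjacent parts changes exactly one symbol of Θ.

```python
# Sketch (assumption: not the authors' implementation): the integration step of
# Constructions III and IV. A double adjacent error that changes parts i and
# i+1 while preserving their sum changes only theta_i, so a single-error-
# correcting code on Theta locates the affected pair of parts.

from itertools import accumulate

def integrate(parts, modulus):
    return [s % modulus for s in accumulate(parts)]

if __name__ == "__main__":
    phi  = [3, 4, 5, 5, 4, 3, 5, 3]          # the example from Construction III
    print(integrate(phi, 2))                 # [1, 1, 0, 1, 1, 0, 1, 0]
    phi2 = [3, 4, 5, 4, 5, 3, 5, 3]          # adjacent parts 5 and 4 exchanged
    print(integrate(phi2, 2))                # [1, 1, 0, 0, 1, 0, 1, 0]: only theta_4 changed
```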
... the k = 2d constraint is otherwise feasible. On the other hand, as can clearly be seen in Fig. 1, this loss in capacity becomes negligible for large d. Currently, the smaller detection window, as depicted in Fig. 2, inhibits the implementation of modulation codes with large d, although even in conventional concatenated coding schemes such codes potentially offer a much improved density ratio and recording density. However, future technologies may include more stable drive mechanisms and improved bit synchronization and detection procedures, and the implementation of our code constructions with k = 2d may then be attractive for these channels, where a large d constraint may in fact be desirable.
REFERENCES
[1] C. E. Shannon, "A mathematical theory of communication," Bell Syst. Tech. J., vol. 27, pp. 379-423, July 1948.
[2] D. T. Tang and L. R. Bahl, "Block codes for a class of constrained noiseless channels," Inform. Contr., vol. 17, pp. 436-461, 1970.
[3] J. Riordan, An Introduction to Combinatorial Analysis. Princeton, NJ: Princeton Univ. Press, 1980.
[4] H. C. Ferreira, "Lower bounds on the minimum Hamming distance achievable with runlength constrained or dc-free block codes and the synthesis of a (16,8) d_min = 4 dc-free block code," IEEE Trans. Magn., vol. MAG-20, no. 5, pp. 881-883, Sept. 1984.
[5] P. H. Siegel, "Recording codes for digital magnetic storage," IEEE Trans. Magn., vol. MAG-21, no. 5, pp. 1344-1349, Sept. 1985.
[6] G. Bouwhuis et al., Principles of Optical Disc Systems. Bristol, Great Britain: Adam Hilger, Ltd., 1987.
[7] K. Forsberg and I. Blake, "The enumeration of (d, k) sequences," in Proc. Twenty-Sixth Ann. Allerton Conf. on Commun., Contr., Computing, Monticello, IL, Sept. 28-30, 1988, pp. 471-472.
[8] K. Forsberg, "The enumeration of constrained sequences," Internal Rep., Dept. of Elect. Eng., Univ. of Waterloo, Ontario, Canada.
[9] R. Adler, D. Coppersmith, and M. Hassner, "Algorithms for sliding block codes," IEEE Trans. Inform. Theory, vol. IT-29, no. 1, pp. 5-22, Jan. 1983.
[10] S. Lin and D. J. Costello, Error Control Coding: Fundamentals and Applications. Englewood Cliffs, NJ: Prentice-Hall, 1983.
[11] J. K. Wolf and G. Ungerboeck, "Trellis coding for partial-response channels," IEEE Trans. Commun., vol. COM-34, no. 8, pp. 765-773, Aug. 1986.
[12] A. R. Calderbank, C. Heegard, and T. A. Lee, "Binary convolutional codes with application to magnetic recording," IEEE Trans. Inform. Theory, vol. IT-32, no. 6, pp. 797-815, Nov. 1986.
[13] H. C. Ferreira, J. F. Hope, and A. L. Nel, "Binary rate four eighths, runlength constrained, error-correcting magnetic recording modulation code," IEEE Trans. Magn., vol. MAG-22, no. 5, pp. 1197-1199, Sept. 1986.
[14] P. Lee and J. K. Wolf, "Combined error-correction/modulation coding," IEEE Trans. Magn., vol. MAG-23, no. 5, pp. 3681-3683, Sept. 1987.
[15] P. Lee, "Combined error-correcting/modulation recording codes," Ph.D. thesis, Univ. of California at San Diego, 1988.
[16] Y. Lin and J. K. Wolf, "Combined ECC/RLL codes," IEEE Trans. Magn., vol. MAG-24, no. 6, pp. 2527-2529, Nov. 1988.
[17] H. C. Ferreira, D. A. Wright, and A. L. Nel, "Hamming distance preserving mappings and trellis codes with constrained binary symbols," IEEE Trans. Inform. Theory, vol. 35, no. 5, pp. 1098-1103, Sept. 1989.
[18] L. J. Fredrickson and J. K. Wolf, "Error detecting multiple block (d, k) codes," IEEE Trans. Magn., vol. MAG-25, no. 5, pp. 4096-4098, Sept. 1989.
[19] K. A. Schouhamer Immink, "Runlength-limited sequences with error-detecting capabilities," in Eighth IERE Int. Conf. Video, Audio, and Data Recording, Birmingham, U.K., Apr. 23-26, 1990, to be presented.
[20] O. Ytrehus, "On error-controlling (d, k) constrained block codes," presented at the IEEE Int. Symp. Inform. Theory, San Diego, CA, January 14-19, 1990.