Group Theory and Error Detecting/Correcting Codes

University of Surrey
Department of Computing
School of Electronics, Computing and Mathematics

Group Theory & Error Detecting / Correcting Codes

Sotiris Moschoyiannis

Technical Report SCOMP-TC-02-01 December 2001

Group Theory and Error Detecting / Correcting Codes

S. K. MOSCHOYIANNIS

Submitted for the Degree of Master of Science in Information Systems from the University of Surrey

Department of Computing
School of Electronics, Computing & Mathematics
University of Surrey
Guildford, Surrey GU2 7XH, UK

September 2001

Supervised by: Dr M. W. SHIELDS

© S. K. Moschoyiannis 2001


ABSTRACT

At the dawn of the 21st century it is more than obvious that the information age is upon us. Technological developments such as orbiting and geo-stationary satellites, deep-space telescopes, high-speed computers, compact disks, digital versatile disks, high-definition television and international networks allow massive amounts of information to be transmitted, stored and retrieved. Consequently, efficient communication of information, in terms of speed, economy, accuracy and reliability, is becoming an essential process.

Since its origins, the field of error detecting / correcting codes arose in response to practical problems in the reliable communication of digital information. Natural communication systems such as our eyes or the English language use mechanisms to achieve reliability. Our eyes, when we are disoriented, use experience to guess the meaning of what they see, depending heavily on the various independent guessing mechanisms of our brains. The English language makes use of built-in restrictions to ensure that most sequences of letters do not form words, so as to allow very few candidates for the correct version of a misspelled word. However, when processing digital information, the lack of assumptions about the original message provides no statistic upon which to base a reasonable guess. Therefore, robust communication systems employ error detecting / correcting codes to combat the noise in transmission and storage systems. These codes obtain their error control capability by adding redundant digits in a systematic fashion to the original message, so that the receiver terminal can reproduce the message if it is altered during transmission.

In order to ensure that redundancy will allow for error detection / correction, various mathematical methods are invoked in the design of error control codes. This study aims to indicate the algebraic techniques applied in developing error detecting / correcting codes. Many of these techniques are based on standard abstract algebra, particularly the theory of groups and other algebraic structures such as rings, vector spaces and matrices built on finite fields. These mathematical concepts are discussed, focusing on their relation to error control coding, as they provide the basis for constructing efficient error detecting / correcting codes.


ACKNOWLEDGEMENTS

I would like to thank my supervisor Dr. M. W. Shields for his guidance through to the completion of this project. He gave me the appropriate degree of freedom in conducting this study, while his valuable comments and suggestions at each phase gave me direction towards the next stage.

I would also like to thank my family for their continuous encouragement throughout this study. Their financial support enabled me to concentrate solely on this project.

Many thanks to my friend Mary for her constant support and assistance. Her printer facilitated numerous corrections on pre-drafts of my work.


CONTENTS

Abstract
Contents
1. Introduction
2. Principles of Error Control
   2.1 Basic Binary Codes
      2.1.1 Repetition Codes
      2.1.2 Single-Parity-Check Codes
      2.1.3 Observation – Comparison
   2.2 Shannon's Theorem
      2.2.1 Observation
      2.2.2 Maximum Likelihood Decoding
3. Mathematical Background I
   3.1 Groups
   3.2 Fields
   3.3 Vector Spaces
   3.4 Matrices
4. Linear Block Codes
   4.1 Block Coding
      4.1.1 Vector Representation
   4.2 Definition of Linear Block Codes
   4.3 Matrix Description
      4.3.1 Generator Matrix
      4.3.2 Parity-Check Matrix
      4.3.3 G, H in Systematic Form
   4.4 Minimum Distance
      4.4.1 Definition
      4.4.2 Minimum Distance and Error Detection / Correction
   4.5 Error Processing for Linear Codes
      4.5.1 Standard Array
         4.5.1.1 Limitations of the Standard Array Decoding
      4.5.2 Syndromes of Words
         4.5.2.1 Observation
5. Hamming Codes – Golay Codes – RM Codes
   5.1 Hamming Codes
      5.1.1 Programming the [7,4] Hamming Code
         5.1.1.1 Perl Program for Systematic Encoding
         5.1.1.2 Perl Program for Syndrome Decoding
   5.2 Golay Codes
   5.3 Reed-Muller Codes
      5.3.1 Definition
      5.3.2 Properties of RM Codes
   5.4 Hadamard Codes
6. Mathematical Background II
   6.1 Structure of Finite Fields
      6.1.1 Basic Properties of Finite Fields
      6.1.2 Primitive Polynomials
      6.1.3 Finite Fields of Order p^m
   6.2 Polynomials over Galois Fields
      6.2.1 Euclid's Algorithm
      6.2.2 Minimal Polynomials
      6.2.3 Factorisation of x^n – 1
      6.2.4 Ideals
7. Cyclic Codes
   7.1 Polynomial Representation
   7.2 Cyclic Codes as Ideals
   7.3 Parity-Check Polynomial – Generator Matrix for Cyclic Codes
   7.4 Systematic Encoding for Cyclic Codes
8. BCH Codes – Reed-Solomon Codes
   8.1 BCH Codes
      8.1.1 Parity-Check Matrix for BCH Codes
   8.2 Reed-Solomon Codes
   8.3 Decoding Non-binary BCH and Reed-Solomon Codes
   8.4 Burst Error Correction and Reed-Solomon Codes
9. Performance of Error Detecting / Correcting Codes
   9.1 Error Detection Performance
   9.2 Error Correction Performance
   9.3 Information Rate and Error Control Capacity
10. Error Control Strategies and Applications
   10.1 Error Control Strategies
   10.2 Error Control Applications
Afterword
Appendix A
Appendix B
Appendix C
Appendix D
Appendix E
Bibliography


1. INTRODUCTION

Any communication system is engaged in a design trade-off between transmitted power, bandwidth and data reliability. Error detecting / correcting codes address the issue of data reliability. In addition, by reducing the undesirable effects of the noisy channel, error control codes also bear on the required transmitted power and bandwidth. Therefore, error detecting / correcting codes are central to the realisation of efficient communication systems. In some cases, analysis of the design criteria for a communication system may once have indicated that the desired system was a physical impossibility.

Shannon and Hamming laid the foundation for error control coding, a field which now includes powerful techniques for achieving reliable reproduction of data transmitted in a noisy environment. Shannon's existential approach motivated the search for codes by providing the limits of ideal error control coding, while Hamming constructed the first error detecting / correcting code.

The purpose of error detecting / correcting codes is to reduce the chance of receiving messages which differ from the original message. The main concept behind error control coding is redundancy: adding further symbols to the original message that do not add information but serve as check / control symbols. Error detecting / correcting codes insert redundancy into the message, at the transmitter's end, in a systematic, analytic manner in order to enable reconstruction of the original message, at the receiver's end, if it has been distorted during transmission. The objective is to ensure that the message and its redundancy are interrelated by a set of algebraic equations; if the message is disturbed during transmission, it is reproduced at the receiver terminal by the use of these equations. Error control efficiency is thus highly associated with applying mathematical theory to the design of error control schemes. The purpose of this study is to indicate the underlying mathematical structure of error detecting / correcting codes.

The Chapters are organised as follows. Chapter 2 highlights the main principles of error detecting / correcting codes. Two examples of basic binary codes are included and the Chapter concludes with a discussion of Shannon's Theorem.


Concepts of elementary linear algebra are introduced in Chapter 3. In particular, the theory of groups and related algebraic structures such as fields, vector spaces and matrices are selectively presented.

Chapter 4 covers the basic structure of linear block codes. Based on the vector representation, these codes are formulated in terms of the mathematical entities introduced in Chapter 3. Additionally, an extensive section is devoted to the decoding process of linear block codes.

The well-known Hamming codes are presented in Chapter 5, along with other classes of linear block codes such as Golay, Hadamard and Reed-Muller codes. The construction of these codes rests on the concepts introduced in Chapters 3 and 4. In this part, we have included two programs that perform encoding and decoding for the [7,4] Hamming code.

Chapter 6 delves into the structure of Galois fields, aiming to form the necessary mathematical framework for defining several powerful classes of codes. The main interest is in the algebraic properties of polynomials over Galois fields.

Chapter 7 proceeds to develop the structure and properties of cyclic codes. The polynomial representation of a code is used as a link between error detecting / correcting codes and the mathematical entities introduced in Chapter 6. Emphasis is placed on the application of these mathematical tools to the construction of cyclic codes.

Chapter 8 is devoted to the presentation of the important classes of BCH and Reed-Solomon codes for multiple error correction. Attention is confined to their powerful algebraic decoding algorithm. The last section describes the capability of Reed-Solomon codes to correct error bursts.

A discussion of basic performance parameters for error detecting / correcting codes, both binary and non-binary, is included in Chapter 9. Chapter 10 investigates suitable error control strategies for specific applications. Reed-Solomon codes are emphasised due to their popularity in current error control systems.


2. PRINCIPLES OF ERROR CONTROL

Error detecting / correcting codes are implemented in almost every electronic device that entails transmission of information, whether this information is transmitted across a communication channel or stored in and retrieved from a storage system such as a compact disk. The set of symbols used to form the information message – this set always being finite – constitutes the alphabet of the code. In order to send information, a channel, be it physical or not, is required. In most of the cases presented, a random symmetric channel is considered. A channel is a random symmetric error channel if for each pair of distinct symbols a, b of the alphabet there is a fixed probability p(a,b) that when a is transmitted, b is received, and p(a,b) is the same for all possible pairs a, b with a ≠ b.

The basic operating scheme of error detecting / correcting codes is depicted in Figure 2–1. Suppose that an information sequence of k message symbols or information digits is to be transmitted. This sequence m may be referred to as the message word.

    m → Encoder → u → channel (random error generator) → v → Decoder → m,
                                               or retransmission required

                              Figure 2–1

The encoder, at the transmitter's end, adds r check digits from the alphabet according to a certain rule, referred to as the encoding rule. The encoder outputs a total sequence u of n digits, called a codeword, which is the actual sequence transmitted. The n – k = r additional digits, known as parity-check digits, are the redundant digits used at the receiver's end for the detection and correction of errors. Errors that occur during transmission alter the codeword u to a word v, which is the received word. The decoder checks whether the received word v satisfies the encoding rule. If it does not, error processing is performed in an attempt to reproduce the actual transmitted codeword u. If this attempt fails, the received word is ignored and retransmission is required; otherwise, the decoder extracts the original message m from the reconstructed codeword u.


2.1 Basic Binary Codes

This section reports on two primary examples of codes employed to transmit an information sequence of 1s and 0s across the binary symmetric channel. Their simple structure illustrates the general operating principles of error detecting / correcting codes, and also sheds light on certain shortcomings of such codes, addressed in the following chapters.

2.1.1 Repetition Codes

Among the simplest examples of binary codes are the Repetition codes, Berlekamp (1968, p. 2), in which each bit is repeated r times. For example, we could send each bit three times: to send 011 we transmit 000111111. If we receive 000111011, an error is detected in the third block; a reasonable action is to take the majority value within that block, changing the 0 bit to 1, so that the received word is correctly decoded as 011. The triple-repetition code is able to detect a double error and correct a single error within each block. Repetition codes are uneconomical, as transmission requires r times as many bits as the original message. In other words, they have low information rate R = k/n, where k is the number of information bits and n is the length of the total sequence of bits transmitted.
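A minimal Perl sketch of triple repetition (an illustration added here; the report's own Perl programs appear in Chapter 5):

    #!/usr/bin/perl
    use strict;
    use warnings;

    # Encode: repeat each bit three times.
    sub encode_rep3 {
        my ($msg) = @_;
        return join '', map { $_ x 3 } split //, $msg;
    }

    # Decode: majority vote within each block of three bits.
    sub decode_rep3 {
        my ($word) = @_;
        my $msg = '';
        while ($word =~ /(.{3})/g) {
            my $block = $1;
            my $ones  = ($block =~ tr/1//);   # count the 1s in the block
            $msg .= $ones >= 2 ? '1' : '0';
        }
        return $msg;
    }

    print encode_rep3('011'), "\n";        # 000111111
    print decode_rep3('000111011'), "\n";  # 011, the single error is corrected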

2.1.2 Single-Parity-Check Codes

The simplest high rate codes beyond the repetition codes are the Single-Parity-Check codes, Berlekamp (1968, p. 3), which contain only one check digit. This digit is set to be the sum of the information digits, where addition is made under the binary rules 0 + 0 = 0, 0 + 1 = 1 + 0 = 1 and 1 + 1 = 0. Thus, the encoding condition is that the total number of 1s, including the check digit, is even in every codeword. In that way, the received word is checked for the number of 1s; if it is even then the codeword is decoded without change, but if it is odd then an error has occurred. The weakness of the single-parity-check codes is that retransmission is required, since there is no way to correct the error. Further, if any even number of errors occurs, the word will be assumed correct even though it is not. Therefore, these codes have high information rate R – note that R approaches 1 as n increases – but are unable to detect any even number of errors and require retransmission for any odd number of errors.
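A corresponding Perl sketch of the single-parity-check condition (again an added illustration):

    #!/usr/bin/perl
    use strict;
    use warnings;

    # Append one check digit so that every codeword has an even number of 1s.
    sub add_parity {
        my ($msg) = @_;
        my $ones = ($msg =~ tr/1//);
        return $msg . ($ones % 2);          # check digit = sum of digits mod 2
    }

    # Detection only: an odd number of 1s means an error occurred.
    sub is_valid {
        my ($word) = @_;
        my $ones = ($word =~ tr/1//);
        return $ones % 2 == 0;
    }

    print add_parity('1011'), "\n";                         # 10111
    print is_valid('10110') ? "accept\n" : "retransmit\n";  # retransmit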

2.1.3 Observation – Comparison

The class of single-parity-check codes attains a high rate R, but this achievement is outweighed by the loss of error correction capacity. On the other hand, the family of repetition codes uses the alphabet inefficiently, with a low information rate R, but gains error correcting capability from this inefficiency.


2.2 Shannon's Theorem

In order to interpolate between these two extreme examples of codes, the focus is placed on codes that have both a high information rate and moderate error correction capability, or equivalently a low error probability PrE. Shannon, in 1948, proved the Fundamental Theorem of Information Theory, considered to be the starting point for coding theory. The capacity G of a channel, used in Shannon's Theorem, represents the maximum amount of information which the channel can transmit and is given by the expression

    G = 1 – H(p)

where the function H(p), called the binary entropy of the information source, is defined as

    H(p) = –p·log2(p) – (1 – p)·log2(1 – p)

for the binary symmetric channel (BSC), whose two symbols have the same probability p of incorrect transmission.
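The capacity expression is easy to evaluate numerically; a small Perl sketch (added here for illustration):

    #!/usr/bin/perl
    use strict;
    use warnings;

    # Capacity G = 1 - H(p) of the binary symmetric channel, where
    # H(p) = -p*log2(p) - (1 - p)*log2(1 - p).
    sub log2 { log($_[0]) / log(2) }

    sub capacity {
        my ($p) = @_;
        return 1 if $p == 0 || $p == 1;   # error-free (or always-inverting) channel
        return 1 + $p * log2($p) + (1 - $p) * log2(1 - $p);
    }

    printf "p = %.2f  G = %.4f\n", $_, capacity($_) for 0.01, 0.10, 0.50;

For example, p = 0.1 gives G ≈ 0.531, so rates just below 0.531 are achievable with arbitrarily small error probability, while p = 0.5 gives G = 0, a channel carrying no information.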

The precise statement is as follows, Jones and Jones (2000, p. 88).

Theorem 2-1: Let Γ be a binary symmetric channel with p < 1/2, so Γ has capacity G = 1 – H(p) > 0, and let δ, ε > 0. Then, for all sufficiently large n, there is a code C ⊆ Z2^n of rate R satisfying G – ε ≤ R < G, such that nearest neighbour decoding gives error probability PrE < δ.

For simplicity, the statement is for the BSC, but the theorem is valid for all channels. Thus, by choosing δ and ε sufficiently small, PrE and R can be made as close as required to 0 and G respectively. Informally, the theorem states that if long enough codewords are chosen, then information can be transmitted across a channel Γ as accurately as required, at an information rate as close as desired to the capacity of the channel. Theorem 2-1 motivates the search for codes whose information rate R approaches 1 while the length of codewords n increases. Such codes are often characterised as 'good' codes.

2.2.1 Observation

Since a very large value of n is required to achieve R → G and PrE → 0, long codewords must be transmitted, making encoding and decoding more complex processes.


Furthermore, if n is large then the receiver may experience delays until a codeword is received, resulting in a sudden burst of information, which may be difficult to handle.

2.2.2 Maximum Likelihood Decoding

Shannon's Theorem proves the existence of good codes, which are likely to have sufficient inherent structure enabling the design of efficient encoding and decoding algorithms. The concept of 'nearest neighbour decoding' or 'maximum likelihood decoding' mentioned in Theorem 2-1 deals with the process of matching an erroneous received word to the actual transmitted codeword.

Assuming that a binary codeword of length n is transmitted, the probability of a particular received word with errors in i positions is p^i·q^(n–i), where q = 1 – p. Since q > p, the received word with no errors is more likely than any other; a received word with one error is more likely than one with two or more errors, and so on. It can be seen that the best decision at the receiver's end is to decode a received word into a codeword which differs from the received word in the fewest positions.
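A direct (exhaustive) Perl sketch of this decision rule, added for illustration:

    #!/usr/bin/perl
    use strict;
    use warnings;

    # Number of positions in which two words of equal length differ.
    sub distance {
        my ($u, $v) = @_;
        my @a = split //, $u;
        my @b = split //, $v;
        return scalar grep { $a[$_] ne $b[$_] } 0 .. $#a;
    }

    # Nearest neighbour decoding: choose a codeword at minimal distance.
    sub decode {
        my ($received, @codewords) = @_;
        my ($best, $best_d);
        for my $c (@codewords) {
            my $d = distance($received, $c);
            ($best, $best_d) = ($c, $d) if !defined $best_d || $d < $best_d;
        }
        return $best;
    }

    my @C = qw(000 111);             # the length-3 repetition code
    print decode('011', @C), "\n";   # 111

Searching through all the codewords like this quickly becomes infeasible as their number grows, which is precisely why the structured decoding methods of Chapter 4 matter.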


3. MATHEMATICAL BACKGROUND I

In an attempt to construct effective codes, as promised by Shannon's Theorem, much work has been done exploiting linear algebra, as illustrated in the chapters that follow. The basic theory of groups, fields, vector spaces and matrices supplies the main algebraic tools and provides the platform for the development of linear block codes, described in Chapters 4 and 5. Elementary notions on these topics are included in Appendix A.

3.1 Groups

A fundamental concept used in the study of error detecting / correcting codes is the structure of a group, which underlies other algebraic structures such as fields and rings. A binary operation operates on two elements of a set at a time, yielding a third (not necessarily distinct) element. When a binary operation, along with certain rules restricting the results of the operation, is imposed on a set, the resulting structure is a group.

Definition 3-1: A group is a set of elements G with a binary operation '·' defined in such a way that the following requirements are satisfied:
1. G is closed; that is, α·β is in G whenever α and β are in G
2. α·(β·γ) = (α·β)·γ for all α, β, γ ∈ G (associative law)
3. There exists e ∈ G such that α·e = e·α = α (identity)
4. For all α ∈ G, there exists α⁻¹ ∈ G such that α·α⁻¹ = α⁻¹·α = e (inverse)

A group G is said to be a commutative or abelian group if it also satisfies:
5. α·β = β·α for all α, β in G

The order of a group is defined to be the cardinality of the group, which is the number of elements contained in the group. The order of a group is not sufficient to completely specify the group; restriction to a particular operation is necessary. Groups with a finite number of elements are called finite groups. For example, the set of integers forms an infinite commutative group under integer addition, but not under integer multiplication, since the latter does not allow for the required multiplicative inverses.


The order of a group element g ∈ G, essentially different from the order of the group, is defined to be the smallest positive integer r such that g^r = e, where e is the identity element of group G. A simple method for constructing a finite group is based on the application of modular arithmetic to the set of integers, as stated in the next two theorems, Wicker (1995, pp. 23-4).

Theorem 3-1: The elements {0, 1, 2, …, m – 1} form a commutative group of order m under modulo m integer addition for any positive integer m.

As for integer multiplication, m cannot be selected arbitrarily, because if the modulus m has factors other than 1 and m, the set will have zero divisors. A zero divisor is a non-zero number a for which there exists a non-zero number b such that a·b = 0 modulo m. Hence, to construct a finite group under multiplication modulo m, the moduli must be restricted to prime integers.

Theorem 3-2: The elements {1, 2, 3, …, p – 1} form a commutative group of order (p – 1) under modulo p multiplication if and only if p is a prime integer.

A subset S of G is a subgroup if it exhibits closure and contains all the necessary inverses; that is, c = a·b⁻¹ ∈ S for all a, b ∈ S. The order of a subgroup is related to the order of the group according to Lagrange's Theorem, which states that the order of a subgroup is always a divisor of the order of the group. Another important algebraic structure in the study of error control codes is the cyclic group, defined as follows.

Definition 3-2: A group G is said to be a cyclic group if each of its elements is equal to a power of an element a in G. The group G is then denoted ⟨a⟩, and the element a is called a generating element of ⟨a⟩. The element a⁰ is by convention the unit element.
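As a small illustration of Theorem 3-2 and Definition 3-2 (a sketch added here, not part of the original report), the following computes the order of every element of the multiplicative group modulo 7:

    #!/usr/bin/perl
    use strict;
    use warnings;

    # Order of g in the multiplicative group mod p (p prime, 1 <= g < p):
    # the smallest positive r with g^r = 1 (mod p).
    sub element_order {
        my ($g, $p) = @_;
        my ($r, $x) = (1, $g % $p);
        while ($x != 1) {
            $x = ($x * $g) % $p;
            $r++;
        }
        return $r;
    }

    print "ord($_) mod 7 = ", element_order($_, 7), "\n" for 1 .. 6;

Here 3 and 5 turn out to have order 6 = 7 – 1, so each generates the whole group; the group is cyclic, for instance ⟨3⟩.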

3.2 Fields

The concept of a field, particularly a finite field, is of great significance in the theory of error control codes, as will be highlighted throughout this study. A common approach to the construction of error detecting / correcting codes suggests that the symbols of the alphabet used are elements of a finite field.

Definition 3-3: A field F is a set of elements with two binary operations '+' and '·' such that:
1. F forms a commutative group under '+' with identity 0
2. F – {0} forms a commutative group under '·' with identity 1
3. The operations '+' and '·' distribute: α·(β + γ) = α·β + α·γ for all α, β, γ ∈ F

For example, the real numbers form an infinite field, as do the rational numbers. A non-empty subset F´ of a field F is a subfield if and only if F´ constitutes a field under the same binary operations as F. If the set F is finite, then F is called a finite field. Finite fields are often known as Galois fields, in honour of the French mathematician Évariste Galois, who provided the fundamental results on finite fields. The order of a finite field F is defined to be the number of elements in the field. It is standard practice to denote a finite field of order q by GF(q). For example, the binary alphabet B is a finite field of order 2, denoted by GF(2), under the operations of modulo 2 addition and multiplication. Finite fields and their properties are further discussed in Chapter 6, since they are used in most of the known classes of error detecting / correcting codes.

3.3 Vector Spaces

The concept of a vector space over a finite field, used for defining the codes presented in the following Chapters, is introduced in this section. Consider V to be a set of elements called vectors and F a field of elements called scalars. A vector space V over a field F is defined by introducing two operations in addition to the two already defined between field elements:

i. Let '+' be a binary additive operation, called the additive vector operation, which maps pairs of vectors v1, v2 ∈ V onto the vector v = v1 + v2 ∈ V
ii. Let '·' be a binary multiplication operation, called the scalar multiplicative operation, which maps a scalar a ∈ F and a vector v ∈ V onto a vector u = a·v ∈ V

Now, V forms a vector space over F if the following conditions are satisfied:
1. V forms an additive commutative group
2. For any element a ∈ F and v ∈ V, a·v = u ∈ V
3. The operations '+' and '·' distribute: a·(u + v) = a·u + a·v and (a + b)·v = a·v + b·v
4. For all a, b ∈ F and all v ∈ V: (a·b)·v = a·(b·v) (associative law)
5. The multiplicative identity 1 in F acts as a multiplicative identity in scalar multiplication: 1·v = v for all v ∈ V

Let u, v ∈ V where v = (v0, v1, …, vn-1) and u = (u0, u1, …, un-1) with {vi} ∈ F and {ui} ∈ F. Then vector addition can be defined as

    v + u = (v0 + u0, v1 + u1, …, vn-1 + un-1)

and scalar multiplication can be defined as

    a·v = (a·v0, a·v1, …, a·vn-1) for a ∈ F, v ∈ V

For example, the set of binary n-tuples, Vn, forms a vector space over GF(2), with coordinate-wise addition and scalar multiplication. Note that the operations on the coordinates are performed under the restrictions imposed by the set they are taken from. Obviously, Vn has cardinality 2^n, since that is the number of all possible distinct sequences of 1s and 0s of length n.

Since V forms an additive commutative group, for scalars a0, a1, …, an-1 in F, the linear combination v = a0·v0 + a1·v1 + … + an-1·vn-1 is a vector in V. A spanning set for V is a set of vectors G = {v0, v1, …, vn-1} whose linear combinations include all vectors in the vector space V. Equivalently, we say that G spans V. A spanning set with minimal cardinality is called a basis for V. If a basis for V has k elements, hence cardinality k, then the vector space V is said to have dimension k. Furthermore, according to the following theorem, Wicker (1995, p. 31), each vector in V can be written as a linear combination of the basis elements for some collection of scalars {ai} in F.

Theorem 3-3: Let {vi}, i = 0..k – 1, be a basis for a vector space V. For every vector in V there is a representation v = a0·v0 + … + ak-1·vk-1. This representation is unique.
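For concreteness, the two vector space operations over GF(2) (a sketch added for illustration):

    #!/usr/bin/perl
    use strict;
    use warnings;

    # Coordinate-wise operations in Vn over GF(2).
    sub vec_add {                      # v + u, each coordinate mod 2
        my ($v, $u) = @_;
        return [ map { ($v->[$_] + $u->[$_]) % 2 } 0 .. $#$v ];
    }

    sub scalar_mul {                   # a·v for a scalar a in GF(2)
        my ($a, $v) = @_;
        return [ map { ($a * $_) % 2 } @$v ];
    }

    my $v = [1,0,1,1];
    my $u = [0,1,1,0];
    print "@{ vec_add($v, $u) }\n";    # 1 1 0 1
    print "@{ scalar_mul(0, $v) }\n";  # 0 0 0 0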


The notion of a vector subspace, defined below, is fundamental to coding theory.

Definition 3-4: A non-empty subset S of a vector space V is called a vector subspace of V if and only if it satisfies the following two properties:
1. Closure under addition: x, y ∈ S implies x + y ∈ S
2. Closure under scalar multiplication: a ∈ F, x ∈ S implies a·x ∈ S

Equivalently, S is a vector subspace of the vector space V over the field F if and only if a·v1 + b·v2 is in S for all v1, v2 ∈ S and a, b ∈ F.

Definition 3-5: Let u = (u0, u1, …, un-1) and v = (v0, v1, …, vn-1) be vectors in the vector space V over the field F. The inner product u·v is defined as

    u·v = Σ_{i=0}^{n–1} ui·vi = u0·v0 + u1·v1 + … + un-1·vn-1

The inner product defined in a vector space V over F has the following properties, derived from its definition:
i. commutative: u·v = v·u for all u, v ∈ V
ii. associative with scalar multiplication: a·(v·u) = (a·v)·u
iii. distributive with vector addition: u·(v + w) = u·v + u·w

If the inner product of two vectors v and u is v·u = 0, then v is said to be orthogonal to u, or equivalently u is orthogonal to v. The inner product, which is a binary operation that maps pairs of vectors in the vector space V over the field F onto scalars in F, is used to characterise dual (null) spaces. Given that V is a vector space over F with an inner product, the dual space S⊥ of a subspace S is defined as the set of all vectors v in V such that u·v = 0 for all u ∈ S. Note that S and S⊥ are not disjoint, since they both contain the vector whose coordinates are all zero, denoted by 0. Additionally, the Dimension Theorem states that the sum of the dimensions of S and S⊥ equals the dimension of the vector space V.
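A sketch of the inner product over GF(2), to make the orthogonality test concrete (an added illustration):

    #!/usr/bin/perl
    use strict;
    use warnings;

    # u·v = u0v0 + u1v1 + ... + un-1vn-1, with arithmetic mod 2.
    sub inner {
        my ($u, $v) = @_;
        my $s = 0;
        $s ^= ($u->[$_] & $v->[$_]) for 0 .. $#$u;
        return $s;
    }

    my $u = [1,1,0,0];
    my $v = [1,1,1,1];
    print inner($u, $v) ? "not orthogonal\n" : "orthogonal\n";   # orthogonal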


3.4 Matrices

It is common practice in error control systems to employ matrices in the encoding and decoding processes. A k×n matrix G over a Galois field GF(q) is a rectangular array with k rows and n columns, where each entry gij is an element of GF(q) (i = 0..k – 1 and j = 0..n – 1). If k ≤ n and the k rows of a matrix G are linearly independent, the q^k linear combinations of these rows form a k-dimensional subspace of the vector space Vn of all n-tuples over GF(q). Such a subspace is called the row space of matrix G. Furthermore, by using the notion of dual space, previously introduced, an important theorem implies the existence of an (n – k)×n matrix H for each k×n matrix G with k linearly independent rows. The precise statement of this theorem, Lin and Costello (1983, p. 47), which will prove rather useful in section 4.3 concerning the matrix description of a code, is as follows.

Theorem 3-4: For any k×n matrix G over GF(q) with k linearly independent rows, there exists an (n – k)×n matrix H over GF(q) with (n – k) linearly independent rows such that for any row gi in G and any row hj in H, gi·hj = 0. The row space of G is the null (dual) space of H, and vice versa.


4. LINEAR BLOCK CODES

Most known codes are block codes (codes of fixed length). These codes divide the data stream into blocks of fixed length, which are then treated independently. There also exist codes of non-constant length, such as the convolutional codes, which offer a substantially different approach to error control. In these codes, redundancy is introduced into an information sequence through the use of linear shift registers, which convert the entire data stream, regardless of its length, into a single codeword. In general, encoding and decoding of convolutional codes depend more on designing the appropriate shift register circuits and less on mathematical structures. Therefore, in this study, attention is confined to block codes, which invoke algebraic techniques mainly based on the theory of groups to insert redundancy into the information sequence.

4.1 Block Coding

In block coding, the information sequence is segmented into message blocks of fixed length k. These message blocks are encoded independently at the transmitter's end, decoded in the same manner at the receiver's end and then combined to retrieve the original message. Using an alphabet of q symbols, where the collection of these q symbols is considered a Galois field of order q, GF(q), there are q^k distinct message blocks. The encoder, according to certain rules, transforms each message block m of length k into an n-tuple u, which is the codeword to be transmitted, as depicted in Figure 4–1.

    m = m0 m1 … mk-1  →  Encoder  →  u = u0 u1 … un-1

                       Figure 4–1

The length n of the codeword is greater than k, and the (n – k) additional digits, often referred to as parity-check digits, are the added redundancy. For the encoding process to be reversible at the receiver's end, so that the original message can be retrieved, there must be a one-to-one correspondence between a message block and its corresponding codeword. This implies that there are exactly q^k codewords. The set of q^k codewords of length n is called an (n,k) block code.


4.1.1 Vector Representation

In the study of block codes it is useful to associate codewords with vectors. Each codeword of length n can be represented by a vector of dimension n:

    u = u0u1...un-1  ↔  u = (u0, u1, ..., un-1)

A codeword u, which is an n-tuple, is represented by the vector u whose n coordinates are the components of the codeword. This representation allows us to exploit the algebraic structures introduced in Chapter 3 and provides the basis for defining linear block codes.

4.2 Definition of Linear Block Codes

In general, encoding and decoding of q^k codewords of length n may become prohibitively complex processes for large n and k: all codewords would need to be stored and searched through for each received word. The linear algebraic structure of a major class of codes, called linear block codes, can be exploited in designing their encoding and decoding schemes. The inherent linearity of these codes means that they have mathematical structure, resulting in a reduction in the complexity of their implementation and analysis. Based on the definition of linear block codes over GF(2) in Lin and Costello (1983, p. 52), these codes can also be defined over a general finite field alphabet as follows.

Definition 4-1: A block code of length n with q^k codewords is called a linear (n,k) code C if and only if its q^k codewords form a k-dimensional subspace of the vector space of all n-tuples over the field GF(q).

Based on the properties of vector spaces, discussed in section 3.3, it can be seen that any linear combination of codewords is a codeword. This implies that the sum of any two codewords in C is a codeword in C. Another consequence is that linear codes always contain the all-zero vector 0 as a codeword.

4.3 Matrix Description

The vector representation of each codeword, combined with Definition 4-1, allows for the matrix description of an (n,k) code. The encoding and decoding processes can be reduced to matrix multiplication, as illustrated in the discussion that follows, resulting in a substantial decrease in complexity.

4.3.1 Generator Matrix

By Definition 4-1, an (n,k) code is a k-dimensional subspace of the vector space of all n-tuples over GF(q). Thus, it is possible to find k linearly independent codewords g0, g1, …, gk-1 in C. The set of these codewords forms a basis for code C, since it consists of k linearly independent elements of C. By use of Theorem 3-3, every codeword c in C is a linear combination of the basis elements and thus can be written as

    c = m0·g0 + m1·g1 + … + mk-1·gk-1

where {mi}, i = 0..k – 1, are in GF(q). The above expression is valid for any codeword c in C, implying that there is a one-to-one correspondence between the set of message blocks of the form (m0, m1, ..., mk-1) and the codewords in C. Therefore, all codewords in C can be formed as linear combinations of k linearly independent codewords in C. As a result, for any (n,k) linear code there exists a k×n matrix G whose rows are these k linearly independent codewords:

        | g0   |   | g0,0     g0,1     ...  g0,n-1   |
        | g1   |   | g1,0     g1,1     ...  g1,n-1   |
    G = |  .   | = |   .        .              .     |
        |  .   |   |   .        .              .     |
        | gk-1 |   | gk-1,0   gk-1,1   ...  gk-1,n-1 |

Clearly, the rows of G completely specify the code C. The matrix G is called a generator matrix for code C and is used for encoding any message m = (m0, m1, ..., mk-1) as

    c = m·G = m0·g0 + m1·g1 + … + mk-1·gk-1


The generator matrix is central to the description of linear block codes, since the encoding process is reduced to matrix multiplication. In addition, only the k rows of G need to be stored; the encoder merely needs to form a linear combination of the k rows based on the input message m = (m0, m1, ..., mk-1). Likewise, the decoding process can be simplified by the use of another matrix, the parity-check matrix, introduced next.
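A Perl sketch of this reduction of encoding to row combinations over GF(2) (the small generator matrix below is an illustrative assumption, not taken from the report):

    #!/usr/bin/perl
    use strict;
    use warnings;

    # c = m·G over GF(2): add up the rows of G selected by the message digits.
    sub encode {
        my ($m, $G) = @_;
        my $n = scalar @{ $G->[0] };
        my @c = (0) x $n;
        for my $i (0 .. $#$m) {
            next unless $m->[$i];                     # mi = 0 contributes nothing
            $c[$_] ^= $G->[$i][$_] for 0 .. $n - 1;   # add row gi, coordinate-wise
        }
        return \@c;
    }

    my $G = [ [1,0,1,1],
              [0,1,0,1] ];               # generator matrix of a small (4,2) code
    print "@{ encode([1,1], $G) }\n";    # 1 1 1 0

Only the two rows of G are stored; the four codewords 0000, 1011, 0101, 1110 are never listed explicitly.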

4.3.2 Parity-Check Matrix

A matrix description of the decoding process can be obtained by the use of Theorem 3-4 and the notion of dual space described in section 3.3.

As stated in Theorem 3-4, for any k×n matrix G with k linearly independent rows there exists an (n – k)×n matrix H with (n – k) linearly independent rows, such that any vector orthogonal to the rows of H is in the row space of G and is thus a valid codeword.

        | h0     |   | h0,0       h0,1       ...  h0,n-1     |
        | h1     |   | h1,0       h1,1       ...  h1,n-1     |
    H = |  .     | = |   .          .                .       |
        |  .     |   |   .          .                .       |
        | hn-k-1 |   | hn-k-1,0   hn-k-1,1   ...  hn-k-1,n-1 |

Such a matrix H is called a parity-check matrix of the code and is used for decoding, since a received word v is a codeword if and only if v·H^T = 0, where H^T is the transpose¹ of H. In addition, the (n – k) linearly independent rows of H span an (n – k)-dimensional subspace of the vector space of all n-tuples over GF(q). It can be seen that this is the dual space of the vector space formed by the (n,k) code. Thus, H can be regarded as a generator matrix for an (n, n – k) code. This code, with regard to the notion of dual space, is called the dual code C⊥ of C.

¹ The transpose of H is an n×(n – k) matrix whose rows are the columns of H and whose columns are the rows of H.

4.3.3 G, H in Systematic Form

The construction process of the generator matrix G and the parity-check matrix H of a linear code implies that they are not unique. Thus, choosing them to have as simple a form as


possible is a mathematical challenge. In addition, the problem of recovering the message block from a codeword is simplified if G and H have a special form.

Consider a linear code C with generator matrix G. By applying elementary row operations² and column permutations, it is always possible to obtain a generator matrix of the form

    G = [P | Ik]

where P is a k×(n – k) matrix and Ik is the k×k identity matrix. Matrix G in the above expression is said to be in systematic form. Note that by permuting the columns of G, code C may change, but the new code C´ will differ from C only in the order of symbols within codewords, so the two codes are not regarded as essentially different. In fact, such codes are called equivalent codes.

If the generator matrix G is in systematic form, the message block is embedded without modification in the last k coordinates of the resulting codeword. The first (n – k) coordinates contain a set of (n – k) symbols that are linear combinations of certain information symbols. This set is determined by the matrix P, which can be stored in the read-only memory (ROM) of a PC.

    c = m·G = [m0 m1 ... mk-1]·[P | Ik] = [c0 c1 ... cn-k-1 | m0 m1 ... mk-1]

Given a generator matrix G in systematic form, a corresponding parity-check matrix in systematic form can be obtained as

    H = [In-k | –P^T]

which over GF(2) reduces to [In-k | P^T], since –1 = 1 there. The use of G and H in systematic form has implementation advantages. An encoder for binary systematic linear codes, according to Reed and Chen (1999, p. 82), can be implemented by a logic circuit, as illustrated in Appendix D. Additionally, the decoding process is simplified, since a received word carries the information in its last k positions; in the case of correct transmission, the original message can be reconstructed by simply extracting the last k coordinates.
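A Perl sketch of systematic encoding for the binary [7,4] Hamming code of Chapter 5 (the particular matrix P below is an assumption chosen for illustration):

    #!/usr/bin/perl
    use strict;
    use warnings;

    # With G = [P | I4], the codeword is (n - k) = 3 check digits computed
    # from P, followed by the unmodified 4-bit message block.
    my @P = ( [1,1,0],
              [1,0,1],
              [0,1,1],
              [1,1,1] );

    sub encode_systematic {
        my (@m) = @_;                    # message block m0..m3
        my @check = (0) x 3;
        for my $i (0 .. 3) {
            next unless $m[$i];
            $check[$_] ^= $P[$i][$_] for 0 .. 2;
        }
        return (@check, @m);             # [c0 c1 c2 | m0 m1 m2 m3]
    }

    print join('', encode_systematic(1, 0, 1, 1)), "\n";   # 0101011

The message 1011 is read off unchanged from the last four coordinates of the codeword.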

² Elementary row operations are defined to be: i. multiplication of a row by a non-zero constant; ii. replacement of a row ri with ri + a·rj, where i ≠ j and a ≠ 0; iii. row permutations (row reordering).


4.4 Minimum Distance

The constructs developed in the previous section are mainly concerned with the realisation of encoding and decoding schemes. In this section, an important parameter that determines the error detection / correction capacity of an error control code is introduced, namely the minimum distance.

4.4.1 Definition

Effective codes tend to use codewords that are very unlike each other, since such a property is required for applying maximum likelihood decoding. Clearly, there is a need to measure how alike or unlike two codewords are. A notion of distance between two codewords, known as the Hamming distance, is defined as follows, van Lint (1999, p. 26).

Definition 4-2: If u and v are two n-tuples, then their Hamming distance is

    dH(u,v) = | { i | 0 ≤ i ≤ n – 1, ui ≠ vi } |

In short, the Hamming distance is the number of places (i = 0..n – 1) in which two codewords differ. It is a metric; that is, it satisfies the following properties that a distance function must satisfy on the set of codewords of a code C:
1. dH(u,v) ≥ 0 for all n-tuples u, v
2. dH(u,v) = 0 if and only if u = v
3. dH(u,v) = dH(v,u)
4. dH(u,w) ≤ dH(u,v) + dH(v,w) for any three n-tuples u, v, w in C (triangle inequality)

The Hamming distance dH is calculated for all possible pairs of codewords of a code C, and much attention is given to its minimum value. The minimum distance d is the least Hamming distance among all codewords, Reed and Chen (1999, p. 85).

Definition 4-3: The minimum distance of a block code C is the minimum Hamming distance between all distinct pairs of codewords in C:

    d = min dH(u,v) over all distinct u, v in C
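A brute-force Perl check of Definition 4-3 for the small (4,2) code used earlier (an added sketch; the weight shortcut it also prints is justified by Theorem 4-1 below):

    #!/usr/bin/perl
    use strict;
    use warnings;

    sub weight { my ($v) = @_; return scalar grep { $_ } @$v }

    sub distance {
        my ($u, $v) = @_;
        return scalar grep { $u->[$_] != $v->[$_] } 0 .. $#$u;
    }

    my @C = ( [0,0,0,0], [1,0,1,1], [0,1,0,1], [1,1,1,0] );

    my ($d_min, $w_min);
    for my $i (0 .. $#C) {
        my $w = weight($C[$i]);
        $w_min = $w if $i > 0 && (!defined $w_min || $w < $w_min);
        for my $j ($i + 1 .. $#C) {
            my $d = distance($C[$i], $C[$j]);
            $d_min = $d if !defined $d_min || $d < $d_min;
        }
    }
    print "min distance = $d_min, min non-zero weight = $w_min\n";   # both are 2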


Calculating distances among q^k codewords is a quite tedious task, which may be simplified by the notion of weight, defined as follows, van Lint (1999, p. 33).

Definition 4-4: The weight w(v) of any vector (codeword) v = v0v1…vn-1 is defined by w(v) = dH(v,0), where we denote (0, 0, …, 0) by 0.

In other words, the weight w(v) of a codeword v is the number of non-zero coordinates in v. Now, the minimum distance of a code can be obtained by using the following theorem, Lin and Costello (1983, p. 63).

Theorem 4-1: The minimum distance of a linear block code is equal to the minimum weight of its non-zero codewords.

Consequently, finding the minimum distance of a code requires the weight structure of the q^k codewords rather than the computation of the Hamming distance for all q^2k ordered pairs of codewords. Another way to determine the minimum distance of a code is by using the parity-check matrix H, described in section 4.3.2. It has been seen that if c is a codeword then c·H^T = 0. Further, c·H^T can be written as a linear combination of the columns of H (the rows of H^T). Thus, the equation c·H^T = 0 implies that certain columns of H are linearly dependent. These results lead to the following theorem, Jones and Jones (2000, p. 131), which provides an alternative way of determining the minimum distance of a code.

Theorem 4-2: Let C be a linear code of minimum distance d and let H be a parity-check matrix for C. Then d is the minimum number of linearly dependent columns of H.

4.4.2 Minimum Distance and Error Detection / Correction

As mentioned earlier, the minimum distance of an (n,k) code determines its error detecting / correcting capacity. If d is the minimum distance of a code, any two distinct codewords differ in at least d coordinates. For such a code, (d – 1) or fewer errors cannot turn one codeword into another. Hence, the code is capable of detecting all patterns of (d – 1) or fewer errors that may occur during transmission. When applying maximum likelihood decoding, incorrect decoding may occur whenever a received word is closer, in Hamming distance, to an incorrect codeword than to the correct codeword. For an (n,k) code with minimum distance d, all incorrect codewords are at a distance of at least d from the transmitted codeword. This implies that incorrect decoding can occur only when at least d / 2 errors occur. This is stated in an equivalent fashion in the following theorem, Pless (1998, p. 11).

Theorem 4-3: If d is the minimum distance of a code C, then C can correct t = ⌊(d – 1)/2⌋ or fewer errors³, and conversely.

Equivalently, the above theorem imposes the condition that d and t satisfy the inequality d ≥ 2t + 1 for a code to be t-error-correcting; that is, a code can correct any word received with errors in at most t of its symbols. It is also possible to employ codes which can simultaneously perform correction and detection. Such modes of error control, often referred to as hybrid modes of error control, are commonly used in practice. It can be shown, by use of geometric arguments according to Reed and Chen (1999, p. 86), that a code with minimum distance d can correct t errors and at the same time detect l errors, where t, l and d satisfy t + l + 1 ≤ d.

A Hamming sphere of radius t contains all possible received words that are Hamming distance t or less from a codeword. If a received word falls within a Hamming sphere, it is decoded as the codeword at the centre of the sphere. The common radius t of these disjoint⁴ spheres should be large in order to achieve good error correcting capacity. Yet, to attain a good information rate R, the number M of these spheres must also be large, resulting in a conflict, since the spheres are disjoint. Thus, there is a limit on the number of spheres, and consequently codewords, that can be used. An upper bound, known as Hamming's Sphere-packing Bound, addresses the problem, as stated in the following theorem, Pless (1998, p. 23).

³ ⌊y⌋ is the largest integer less than or equal to y (floor function).
⁴ The requirement that the Hamming spheres are disjoint allows only one candidate codeword for each received word (unambiguous decoding).


Theorem 4-4: If C is a code of length n with dimension k, minimum distance d and M = q^k codewords, then

    q^k · ( C(n,0) + C(n,1)·(q – 1) + C(n,2)·(q – 1)^2 + ... + C(n,t)·(q – 1)^t ) ≤ q^n

where C(n,i) denotes the binomial coefficient. Hence, given n and k, this expression bounds t and so bounds d. An important result that rests on Theorem 4-4 introduces the notion of t-perfect codes, as presented by Jones and Jones (2000, p. 108).

Theorem 4-5: Every t-error-correcting linear (n,k) code C over GF(q) satisfies

    Σ_{i=0}^{t} C(n,i)·(q – 1)^i ≤ q^(n–k)

A linear code C is t-perfect if it attains equality in the above theorem. Note that the inequality in Theorem 4-5 can be obtained by dividing the inequality in Theorem 4-4 by the number of all codewords, q^k, since C is of dimension k. Based on this condition for t-perfect codes, Golay constructed the two perfect codes, G11 and G23, as will be described in section 5.2.

From the discussion in this section, it can be concluded that the minimum distance is central to the structure of error detecting / correcting codes. Its tight relation to the error control capacity of a code motivates the search for codes with a large minimum distance for given length n and dimension k.
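As a worked check of Theorem 4-5 (a standard computation added here for illustration): the binary [7,4] Hamming code of Chapter 5 has t = 1, and the left-hand side is C(7,0) + C(7,1)·(2 – 1) = 1 + 7 = 8, while q^(n–k) = 2^(7–4) = 8; the bound is attained with equality, so the code is 1-perfect.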

4.5 Error Processing for Linear Codes

Effective codes are highly associated with an efficient decoding algorithm. In general, decoding schemes designed for a specific code tend to be efficient for that code or class of codes, rather than for any code; in short, there is no 'one size fits all' among decoding algorithms. However, syndrome decoding is a method that achieves reasonable efficiency for most codes. Therefore, it can be used as a common platform for comparison with a decoding scheme for a specific code or family of codes. It is a method of error processing for linear codes that always produces a closest codeword – that is, complete decoding – by using the parity-check matrix for an efficient implementation of maximum likelihood decoding.


4.5.1 Standard Array

The first stage in implementing the decoding process for linear codes is the construction of an array T containing all the words over the finite field GF(q), which is the alphabet of the code. Put simply, T consists of all n-tuples with components from the field GF(q). This array is constructed as follows.

Step 1: The first row of T consists of all the codewords in any order, with the restriction that the first word is 0, the all-zero vector, which has all entries equal to zero.

Step 2: The i-th row is formed by choosing an n-tuple over GF(q) that has not yet appeared in T and placing it in the first column. This word is called the row leader. The rest of the row is determined by adding the codeword at the head of each column – an element of the first row – to the row leader.

Following the above steps, an array T with q^(n–k) rows and q^k columns is formed, called a standard array for the linear (n,k) code. Clearly, T contains all words of length n over GF(q), where q is the number of code symbols and n is the length of the code. From the way in which the standard array is constructed – that is, every entry is the sum of its row leader and the codeword at the head of its column – it follows that the horizontal differences in a standard array are codewords. This property enables a test of whether two words u and v lie in the same row of the standard array, according to the following theorem, Pretzel (1996, p. 50).

Theorem 4-6: Let C be a linear (n,k) code and T a standard array for C. Then two entries u and v lie in the same row of T if and only if their difference u – v is a codeword.

Error processing with the standard array is performed by replacing each received word with the codeword at the head of its column. An important property that rests on the above theorem is that every word of length n over GF(q) occurs exactly once in a standard array for code C. As a result, standard array decoding is complete and unambiguous. By 'complete' is meant that every received word is always matched to a codeword, and 'unambiguous' refers to the fact that the received word uniquely determines the corresponding codeword, not allowing any arbitrary choice.
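As a small illustration (using the (4,2) code from the earlier encoding sketch, an assumption of these notes rather than an example from the report), one standard array is:

    0000  1011  0101  1110
    1000  0011  1101  0110
    0100  1111  0001  1010
    0010  1001  0111  1100

The first row lists the codewords; each later row is obtained by adding its row leader (first column) to every codeword, and every binary word of length 4 appears exactly once.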


The set of elements that appear in the same row is called a coset. Thus, the cosets of a code can be defined as the rows of the standard array. These cosets are the same for any possible standard array of a code, because the test of whether two entries lie in the same row is based on the set of codewords of the code and not on the actual choice of the array. The entries in the first row may be in a different order, but the sets of elements are the same. Consequently, the remaining rows in the standard array have the same entries, even though their order may differ, since the ordering depends on the choice of row leader for each row. This property of the standard array is important, as the error processor associates the whole row with its row leader.

Once the error processor using a standard array detects an erroneous received word, instead of guessing which codeword was sent, it attempts to guess which error occurred. An incorrect received word v equals the transmitted codeword u plus an error pattern e, where e = v – u. To exploit this, the error patterns are chosen to be the row leaders of the standard array. A received word v then lies in the row led by its error pattern e and in the column headed by a codeword u, so that u = v – e and the transmitted codeword u is determined.

Since a standard array for a linear (n,k) code cannot always have all possible error patterns as row leaders, decoding with a standard array does not enable correction of all errors. In an attempt to ease this deficiency of the standard array, the row leaders can be chosen to be of minimal weight among all candidates. In this case, the row leaders are called coset leaders. The choice may still be arbitrary in many cases, leading to different versions of the standard array, but the cosets will consist of the same words for the same coset leader in any version of the array. In this way, each received word is corrected to a closest codeword. This argument is justified in the following theorem, Pretzel (1996, p. 54).

Theorem 4-7: Let C be a linear (n,k) code over an alphabet A with a standard array T in which the row leaders have the smallest possible weight. Let u be a word of length n and let v be the codeword at the head of its column. Then for any codeword w, the distance d(u,w) is not less than the distance d(u,v).

4.5.1.1 Limitations of the Standard Array Decoding

Error processing with the standard array is a relatively simple method with a simple practical implementation, since it provides a direct match between the received word and the corresponding codeword. However, there are certain limitations. The size of the standard array often becomes prohibitive: codes need long block lengths in order to achieve multiple error correction, resulting in standard arrays with, say, 2^100 entries, which is not feasible on computers in terms of storage space and the time required to search for a specific entry. Additionally, in the standard array decoding technique, the error patterns corrected are the row (coset) leaders. As a result, if two error patterns lie in the same row (coset), at most one of them can be corrected.

With regard to this limitation on the error correction capabilities of the code, a set of error patterns is chosen – usually the set of all errors up to some given weight – whose elements lie in distinct cosets. These elements are chosen to be the coset leaders, allowing the standard array to correct all of them and no more. In short, the corresponding code can correct all error patterns of a certain weight so long as they appear as coset leaders. Regarding the size of the standard array, the notion of syndromes of received words reduces the large standard array to a table of coset leaders paired with their syndromes, as discussed next, resulting in considerable savings in storage space.

4.5.2 Syndromes of Words

The underlying relationship between the standard array and the check matrix of a code can be exploited to reduce the size of the standard array and to determine which errors the code can correct. Recall that the basic property of the check matrix is that a word v is a codeword if and only if v·H^T = 0. It follows that two words u and v lie in the same coset if and only if u·H^T = v·H^T. In the standard array decoding method, the main interest is in specifying the row that contains the received word; the error pattern is then assumed to be the row leader. Since v·H^T determines the row of the received word v, there is no need to store all the elements but only this value, which is the same for all elements of a specific row and for its row leader. Thus, the value v·H^T determines the error pattern and, according to Pretzel (1996, p. 56), can be defined as follows.

Definition 4-5: Given a linear (n,k) code C and a check matrix H, the syndrome of a word v of length n is v·H^T.

Consequently, the standard array can be reduced to a table containing the row leaders and their syndromes only, resulting in reduced storage space, while the search to locate a received word is less time-consuming.
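As an illustration of the definition, a syndrome can be computed with a few lines of Perl. The helper below is an assumption of this sketch rather than part of the report's programs; it represents the rows of H as bit strings.

# syndrome of a binary word v with respect to a check matrix H, i.e.
# v . H^T over GF(2); each row of H contributes one syndrome bit
sub syndrome {
    my ($v, @H) = @_;
    my @s;
    for my $row (@H) {
        my $bit = 0;
        $bit += substr($v, $_, 1) * substr($row, $_, 1)
            for 0 .. length($v) - 1;
        push @s, $bit % 2;
    }
    return join '', @s;
}

# with the check matrix of the example below:
# syndrome('1100', '1110', '0101') yields '01'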


A received word and the corresponding error pattern have the same syndrome. Indeed, supposing that codeword u is transmitted and v = u + e is received, where e is the error pattern, the syndrome of v is

v·H^T = (u + e)·H^T = u·H^T + e·H^T = 0 + e·H^T = e·H^T

since u ∈ C and u·H^T = 0 because u is a codeword. The basic property of the check matrix can be restated using the notion of syndromes by saying that a received word v is a codeword if and only if its syndrome v·H^T = 0.

The syndrome of each received word is computed and located in the syndrome list of the syndrome table. Then, the corresponding coset leader is the error pattern to be subtracted from the received word. In this way, the actual codeword sent is reconstructed. For example, let G be the generator matrix for a binary code C,

1 0 1 0 

G=   0 1 1 1  and H can be taken to be the check matrix,

1 1 1 0 

H=   0 1 0 1 The first row of a possible standard array consists of the codewords with 0 first,

codewords

0000

1010

0111

1101

cosets

1000

0010

1111

0101

0100

1110

0011

1001

0001

1011

0110

1100

The syndrome table, consisting of the row leaders and their syndromes, is

coset leaders    syndromes
0000             00
1000             10
0100             11
0001             01


Supposing that u = 1101 is transmitted and v = 1100 is received, the syndrome of v is computed as v·H^T = [0 1]. Syndrome 01 is in the fourth row of the table and its corresponding error pattern (coset leader) is e = 0001. Hence, the correct codeword is determined by subtracting the error pattern from the received word: u = v – e = 1101, which is the actual codeword transmitted. Note that the error pattern 0010 was not chosen as a coset leader, because it had already appeared in a coset, in the second row of the standard array.

The syndrome of the coset leader 1000 is identical to the first column of the check matrix (written as a row). Likewise, the syndrome of 0100 is the second column of H and that of 0001 is the fourth column of the check matrix. The position of the 1 in the error pattern thus indicates the column of H and consequently the position of the error in the received word. In the previous example, the error pattern was 0001 and the erroneous symbol was in the fourth position of the received word.

4.5.2.1 Observation

Syndrome decoding is more efficient than standard array decoding in terms of storage space and speed of error processing. Considering a binary (100,60) code, standard array decoding requires storing 2^100 entries and searching through 2^60 vectors to locate a received word. By implementing syndrome decoding, the requirement is to store and search through 2^40 coset leaders and their syndromes, resulting in reduced storage space and a less time-consuming process.
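For the small (4,2) example code above, the whole procedure reduces to a table lookup. The fragment below assumes the add_words and syndrome subroutines sketched earlier; the hash %leader simply hard-codes the syndrome table and is an illustrative name.

# syndrome decoding for the (4,2) example code: map the syndrome of
# the received word to a coset leader and subtract it (over GF(2),
# subtraction is the same as addition)
my @H = ('1110', '0101');
my %leader = ('00' => '0000', '10' => '1000',
              '11' => '0100', '01' => '0001');

my $v = '1100';                            # received word
my $e = $leader{ syndrome($v, @H) };       # guessed error pattern
my $u = add_words($v, $e);                 # corrected codeword
print "$u\n";                              # prints 1101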


5. HAMMING CODES – GOLAY CODES – RM CODES

5.1 Hamming Codes

Hamming codes – discovered by Hamming shortly after Shannon's Theorem motivated the search for error detecting / correcting codes – comprised the first family of codes capable of correcting errors. These codes have been used for error control in digital communication systems and computer memories. As described by Reed and Chen (1999, p. 104), a binary Hamming code can be constructed for any positive integer m ≥ 2, with the following parameters:

n = 2^m – 1        code length
k = 2^m – m – 1    information symbols
d = 3              minimum distance

Hamming codes are determined by their parity-check matrices. The parity-check matrix H of a binary Hamming code consists of all non-zero m-tuples as its columns, which may be ordered arbitrarily. The smallest number of distinct non-zero binary m-tuples that are linearly dependent is three, and this can be justified as follows. Since the columns of H are non-zero and distinct, no two columns add to zero. In addition, H has all the non-zero m-tuples as its columns and thus the vector sum of any two columns must be a column of H as well. This implies that h_i + h_j + h_k = 0 for some columns h_i, h_j, h_k, which means that three columns in H are linearly dependent. It follows from Theorem 4-2 that the Hamming codes always have minimum distance three. Further, Theorem 4-3 implies that Hamming codes can always correct a single error.
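The construction of H just described is easy to mechanise. The sketch below, with illustrative names once more, lists all non-zero m-tuples as columns and returns the m rows of the resulting check matrix as bit strings; the particular column ordering is one of the arbitrary choices the text allows.

# rows of a parity-check matrix for the binary Hamming code of order m:
# the columns are all 2^m - 1 non-zero binary m-tuples
sub hamming_check_matrix {
    my ($m) = @_;
    my @cols = map { sprintf "%0${m}b", $_ } 1 .. 2**$m - 1;
    return map {
        my $i = $_;                        # row index
        join '', map { substr($_, $i, 1) } @cols;
    } 0 .. $m - 1;
}

# hamming_check_matrix(3) yields ('0001111', '0110011', '1010101'),
# a check matrix for the [7,4] Hamming code of section 5.1.1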

Hamming codes can be decoded with syndrome decoding, developed in section 4.5. Suppose that r is the received word and e is the error pattern with a single error in the j-th position. When the syndrome of r is computed, the transpose of the j-th column of H is obtained, as demonstrated in the following expression:

s = r·H^T = e·H^T = h_j^T

Thus, the decoding process can be performed as follows:


Step 1: Compute the syndrome s. If s = 0 then go to Step 4.
Step 2: Determine the position j of the column of H which is the transpose of the syndrome.
Step 3: Complement the bit in the j-th position of the received word.
Step 4: Output the resulting codeword.
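These four steps translate directly into code. The sketch below reuses the syndrome helper from section 4.5.2 and is, once more, an illustration rather than one of the report's programs.

# correct a single error in a received word r of a binary Hamming
# code; @H holds the rows of the check matrix as bit strings
sub hamming_correct {
    my ($r, @H) = @_;
    my $s = syndrome($r, @H);                      # Step 1
    return $r unless $s =~ /1/;                    # s = 0: no error
    for my $j (0 .. length($r) - 1) {              # Step 2
        my $col = join '', map { substr($_, $j, 1) } @H;
        if ($col eq $s) {                          # Step 3: flip bit j
            substr($r, $j, 1) = substr($r, $j, 1) eq '1' ? '0' : '1';
            last;
        }
    }
    return $r;                                     # Step 4
}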

5.1.1 Programming the [7,4] Hamming Code

It has been seen that an [n,k] Hamming code can be constructed for any positive integer m ≥ 2. Choosing m = 3 results in a code with the following parameters:

n = 7    code length
k = 4    information symbols
d = 3    minimum distance

This is the [7,4] Hamming code, which encodes message words of length 4 as codewords of length 7. A table of all 2^4 message words and their corresponding codewords, taken from Lin and Costello (1983, p. 67), was rather useful for testing the programs presented in the following sections and is included in Appendix E. This table can be constructed by using the following equations (encoding rule), based on Pretzel (1996, p. 67):

c1 = m1 + m3 + m4
c2 = m1 + m2 + m3        (5.1.1)
c3 = m2 + m3 + m4

where {mi} denote the coordinates of the message word m = (m1 m2 m3 m4). The corresponding codeword is formed as c = (c1 c2 c3 c4 c5 c6 c7), where c1, c2, c3 are determined by the above equations and c4 = m1, c5 = m2, c6 = m3, c7 = m4. For example, the message word m = (0 1 0 1) is encoded as c = (1 1 0 0 1 0 1) since

c1 = m1 + m3 + m4 = 0 + 0 + 1 = 1
c2 = m1 + m2 + m3 = 0 + 1 + 0 = 1
c3 = m2 + m3 + m4 = 1 + 0 + 1 = 0
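Before turning to the matrix-based program, note that the encoding rule (5.1.1) can also be coded directly, without a generator matrix. The short subroutine below is an illustrative alternative, not part of the report's programs.

# systematic [7,4] Hamming encoding straight from equations (5.1.1)
sub encode74 {
    my @m = @_;                                    # (m1 m2 m3 m4)
    return ( ($m[0] + $m[2] + $m[3]) % 2,          # c1 = m1 + m3 + m4
             ($m[0] + $m[1] + $m[2]) % 2,          # c2 = m1 + m2 + m3
             ($m[1] + $m[2] + $m[3]) % 2,          # c3 = m2 + m3 + m4
             @m );                                 # c4..c7 = m1..m4
}

# encode74(0, 1, 0, 1) yields (1, 1, 0, 0, 1, 0, 1), as computed above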


Note that the information symbols – that is, the message word coordinates – occupy the last 4 positions of the corresponding codeword and therefore equations (5.1.1) perform systematic encoding for the [7,4] Hamming code.

5.1.1.1 Perl Program for Systematic Encoding

This program, written in the Perl programming language, performs systematic encoding for the [7,4] Hamming code. It is based on the encoding rule described above and uses the corresponding generator matrix G in systematic form,

G = [ 1 1 0 1 0 0 0 ]
    [ 0 1 1 0 1 0 0 ]
    [ 1 1 1 0 0 1 0 ]
    [ 1 0 1 0 0 0 1 ]

Though comments have been added to the source code for clarity, a brief description of the program follows. The program takes a message word as input from the user. This message word, consisting of 4 bits, is encoded through matrix multiplication by the generator matrix G of the code, where the operations are performed modulo 2. The output of the program is the corresponding codeword.

#!/usr/bin/perl -w
#HamSysEncoder.pl
#the program performs systematic encoding for the [7,4] Hamming Code
#it uses the generator matrix G in systematic form

#initialisation of generator matrix G in systematic form
while (1) {
    @g0 = qw(1 1 0 1 0 0 0);
    @g1 = qw(0 1 1 0 1 0 0);
    @g2 = qw(1 1 1 0 0 1 0);
    @g3 = qw(1 0 1 0 0 0 1);

    #input message word from user (any binary 4-tuple)
    #assign it to @message
    print "Enter a message word using spaces between the bits:\n";
    chomp($mword = <STDIN>);


    if (!(defined($mword))) {
        print "Enter a message word first:\n";
    } else {
        @message = split(/\s+/, $mword);
    }

    #multiply message word by matrix G

    #for each message word coordinate = 0, make the corresponding G row = 0
    #else leave it as it is
    if ($message[0] == 0) {
        for ($j = 0; $j < 7; $j++) {