AN APPROACH TO RAID-6 BASED ON CYCLIC GROUPS

15 downloads 6885 Views 227KB Size Report
Mar 27, 2010 - an MTBF is actually achieved and that the drives fail independently, the ... For every N bits of user data written on N disks of the array, a parity bit equal to an exclusive OR ... Binary content of any disk can be then recovered as a ...... http://www.fujitsu.com/downloads/COMP/fcpa/hdd/sata-mobile-ext-duty ...
AN APPROACH TO RAID-6 BASED ON CYCLIC GROUPS

arXiv:cs/0611109v3 [cs.IT] 27 Mar 2010

ROBERT JACKSON, DMITRIY RUMYNIN, AND OLEG V. ZABORONSKI Abstract. As the size of data storing arrays of disks grows, it becomes vital to protect data against double disk failures. A popular method of protection is via the Reed-Solomon (RS) code with two parity words. In the present paper we construct alternative examples of linear block codes protecting against two erasures. Our construction is based on an abstract notion of cone. Concrete cones are constructed via matrix representations of cyclic groups of prime order. In particular, this construction produces EVENODD code. Interesting conditions on the prime number arise in our analysis of these codes. At the end, we analyse an assembly implementation of the corresponding system on a general purpose processor and compare its write and recovery speed with the standard DP-RAID system.

1. Introduction A typical storage solution targeting a small-to-medium size enterprise is a networked unit with 12 disk drives with total capacity of around 20 TB [13, 9]. The volume of information accumulated and stored by a typical small-size information technology company amounts to fifty 100-gigabyte drives. The specified mean time between failures (MTBF) for a modern desktop drive is about 500,000 hours [14]. Assuming that such an MTBF is actually achieved and that the drives fail independently, the probability of a disk failure in the course of a year is 1 − e−12/57 ≈ 0.2. Therefore, even a small company can no longer avoid the necessity of protecting its data against disk failures. The use of redundant arrays of independent disks (RAIDs) enables such a protection in a cost efficient manner. To protect an array of N disks against a single disk failure it is sufficient to add one more disk to the array. For every N bits of user data written on N disks of the array, a parity bit equal to an exclusive OR (XOR) of these bits is written on the (N + 1)-st disk. Binary content of any disk can be then recovered as a bitwise XOR of contents of remaining N disks. The corresponding system for storing data and distributing parity between disks of the array is referred to as RAID-5 [8]. Today, RAID-5 constitutes the most popular solution for protected storage. As the amount of data stored by humanity on magnetic media grows, the danger of multiple disk failures within a single array becomes real. Maddock, Hart and Kean argue that for a storage system consisting of one hundred 8 + P RAID-5 arrays the rate of failures amounts to losing one array every six months [8]. Because of this danger, RAID-5 is currently being replaced with RAID-6, which offers protection against double failure of drives within the array. RAID-6 refers to any technique where two strips of redundant data are added to the strips of user data, in such a way that all the information can be restored if any two strips are lost. A number of RAID-6 techniques are known [4, 8, 10].A well-known RAID-6 scheme is based on the rate-255/257 Reed-Solomon code [1]. In this scheme two extra disks are introduced for up to 255 disks of data and two parity bytes are computed per 255 data bytes. Hardware implementation of RS-based RAIDe = GF (256), which are byte-based. Addition of bytes is just a bitwise 6 is as simple as operations in F XOR. Multiplication of bytes corresponds to multiplication of boolean polynomials modulo an irreducible polynomial. Multiplication can be implemented using XOR-gates, AND-gates and shifts. Some RAID-6 schemes use only bitwise XOR for the computation of parity bits by exploiting a twodimensional striping of disks of the array. Examples are a proprietary RAID-DP developed by Network Appliances [7] and EVENODD [2]. Some other RAID-6 methods use a non-trivial striping and employ only Date: June 3, 2008, revised on March 26, 2010. 1991 Mathematics Subject Classification. Primary 94B60; Secondary 11A41. Key words and phrases. RAID-6, Artin’s conjecture, Mersenne prime, Reed-Solomon Code. Oleg V. Zaboronski was partially supported by Royal Society Industrial Fellowship. 1

XOR operation for parity calculation and reconstruction. Examples include X-code, ZZS-code and Park-code [8, 11, 12]. In all the cases mentioned above, the problem dealt with is inventing an error correcting block code capable of correcting up to two erasures (we assume that it is always known which disks have failed). In the present paper we describe a general approach to the solution of this problem, which allows one to develop an optimal RAID-6 scheme for given technological constraints (e.g. available hardware, the number of disks in the array, the required read and write performance). We also consider an assembly implementation of an exemplary RAID-6 system built using our method and show that it outperforms the Linux kernel implementation of RS-based RAID-6. The paper is organised as follows. In Section 2 we discuss RAID-6 in the context of systematic linear block codes and construct simple examples of codes capable of correcting two errors in known positions. In Section 3 we identify an algebraic structure (cone) common to all such codes and use it to construct RAID-6 schemes starting with elements of a cyclic group of a prime order. Section 3.3 is of particular interest to number theorists where we discuss a new condition on the prime numbers arising in the context of RAID-6 schemes. In Section 4 we compare encoding and decoding performances of an assembly implementation of RAID-6 based on Z17 with its RS-based counterpart implemented as a part of Linux kernel. Let us comment on the relation of the presented material to other modern research efforts. Section 2 is rather standard [3]. All original theoretical material of this paper is in Section 3. The notion of a cone is somewhat related to a non-singular difference set of Blaum and Roth [3] but there are essential differences between them. The cone from a cyclic group of prime order as in Lemma 3.4.1 gives EVENODD code [2]. Its extended versions and connections to number theoretic conditions are new. 2. RAID-6 from the viewpoint of linear block codes. Suppose that information to be written on the array of disks is broken into words of length n bits. What is the best rate linear block code, which can protect data against the loss of two words? Altogether, there are 22n possible pairs of words. In order to distinguish between them, one needs at least 2n distinct syndromes. Therefore, any linear block code capable of restoring 2 lost words in known locations must have at least 2n parity checks. Suppose the size of the information block is N n bits or N words. In the context of RAID, N is the number of information disks to be protected against the failure. Then the code’s block size must be at least (2 + N )n and the rate is R≤

N N +2

This result is intuitively clear: to protect N information disks against double failure, we need at least 2 parity disks. Note however, that in order to achieve this optimal rate, the word length n must grow with the number of disks N . Really, the size of parity check matrix is 2n × (N + 2)n. All columns of the matrix must be distinct and nonzero. Therefore, (N + 2)n ≤ (22n − 1), i. e.   1 2n 2 − 1 − 2n ≥ N n If in particular n = 1, then N ≤ 1. Therefore, if one wants to protect information written on the disks against double failures using just two parity disks, the word size n ≥ 2 is necessary. If n = 2, we get N ≤ 5. In reality, the lower bound on n (or, equivalently, the upper bound on N ) is more severe, as the condition that all columns of parity check matrix are distinct leads to a code with minimal distance dmin = 2. However, in order to build a code which corrects up to 2n errors in the known location we need dmin = 2n. In the following subsections we will construct explicit examples of linear codes for RAID-6 for small values of n and N . These examples both guide and illustrate our general construction of RAID-6 codes presented in Section 3. 2.1. Redundant array of four independent disks, which protects against the failure of any two disks. We restrict our attention to systematic linear block codes. These are determined by the parity matrix. To preserve a backward compatibility with RAID-5 schemes, we require half of the parity bits to be 2

the straight XOR of the information bits. Hence the general form of the parity check matrix for N = 2 is   In×n In×n In×n 0n×n P = (1) H G 0n×n In×n , where In×n and 0n×n are n × n identity and zero matrix correspondingly; G and H are some n × n binary matrices. The corresponding parity check equations are (2)

d1 + d2 + π1 = 0

and

H · d1 + G · d2 + π2 = 0.

Here d1 , d2 are n-bit words written on disks 1 and 2, π1 and π2 are n-bit parity check words written on disks 3 and 4; ”·” stands for binary matrix multiplication. Matrices G and H defining the code are constrained by the condition that the system of parity check equations must have a unique solution with respect to any pair of variables. To determine these constraints we need to consider the following particular cases. (π1 , π2 ) are lost. The system (2) always has a unique solution with respect to lost variables: we can compute parity bits in terms of information bits. (d1 , π2 ) are lost. The system (2) always has a unique solution with respect to lost variables: compute d1 in terms of π1 and d2 using the first equation of (2) as in RAID-5. Then compute π2 using the second equation. (d2 , π2 ) are lost. The system (2) always has a unique solution with respect to lost variables: compute d2 using π1 and d1 as in RAID-5. Then compute π1 using (2). (π1 , d1 ) are lost. The system (2) always has a unique solution with respect to lost variables provided the matrix H is invertible. (π1 , d2 ) are lost. The system (2) always has a unique solution with respect to lost variables provided the matrix G is invertible. (d1 , d2 ) are lost. The system (2) always has a unique solution with respect to lost variables provided the  In×n In×n is invertible. matrix Hn×n Gn×n As it turns out, one can build a parity check matrix satisfying all the non-degeneracy requirements listed above for n = 2. The simplest choice is   0 1 H = I2×2 , G = (3) . 1 1 Non-degeneracy of the three matrices H, G and (2.1) is evident. For instance1,   In×n In×n = −1 = 1. det Hn×n Gn×n We conclude that the linear block code with a 4 × 8 parity check matrix (1, 3) gives rise to RAID-6 consisting of four disks. The computation of parity dibits π1 , π2 in the described DP RAID is almost as simple as the computation of regular parity bits: Let d1 = (d11 , d12 ) and d2 = (d21 , d22 ) be the dibits to be written on disks one and two correspondingly. Then (4)

π11 = d11 + d21 , π21 = d11 + d22 ,

π12 = d12 + d22 , π22 = d12 + d21 + d22 .

The computations involved in the recovery of lost data are bitwise XOR only. As an illustration, let us write down expressions for lost data bits in terms of parity bits explicitly: d22 = π11 + π12 + π21 + π22 , d11 = π11 + π12 + π22 ,

d12 = π11 + π21 + π22 , d21 = π12 + π22 .

It is interesting to note that RAID-6 code described here is equivalent to Network Appliances’ horizontaldiagonal parity RAID-DPT M with two data disks [7]. Really, diagonal-horizontal parity system for two info disks is A B HP DP 1 C D HP 2 DP 2, 1The reader is aware that −1 = 1 6= 0 in characteristic 2 3

where strings (A, C) are written on information disk 1, strings (B, D) are written on disk 2, (HP, HP 2) is horizontal parity, (DP 1, DP 2) is diagonal parity. By definition, HP = A+ B, HP 2 = C + D, DP 1 = A+ D, DP 2 = B + C + D, which coincides with parity check equations (4). On the other hand, the code (1) is a reduction of the RS code based on GF (4) which we will describe in the next subsection. 2.2. Redundant array of five independent disks, which protects against the failure of any two disks. The code (1) can be extended to a scheme providing double protection of user data written on three disks [3, Example 1.1]. The parity check matrix is   I2×2 I2×2 I2×2 I2×2 02×2 P = (5) I2×2 G G2 02×2 I2×2 , where 2 × 2 matrix G was defined in the previous subsection. The corresponding parity check equations are (6)

d1 + G · d2 + G2 · d3 + π2 = 0.

d1 + d2 + d3 + π1 = 0,

The solubility of these equations with respect to any pair of variables from the set {d1 , d2 , d3 , π1 , π2 } requires two extra conditions of non-degeneracy in addition conditions listed in the pre  to non-degeneracy  I2×2 I2×2 I2×2 I2×2 vious subsection. Namely, matrices and must be invertible. It is possible I2×2 G2 G G2 to check the invertibility of these matrices via a direct computation. However, in the next section we will construct a generalisation of the above example and find an elegant way of proving non-degeneracy. The code (1) is a reduction of (5) corresponding to d3 = 0. Note also that the code (5) is equivalent to rate-3/5 Reed-Solomon code based on GF (4): a direct check shows that the set of 2 × 2 matrices 0, 1, G, G2 is closed under multiplication and addition and all non-zero matrices are invertible. Thus this set forms a field isomorphic to GF (4). On the other hand, as we established in the previous subsection, the code (1) is equivalent to RAID-DPT M with four disks. Therefore, RAID-DPT M with four disks is a particular case of the RS-based RAID-6. It would be interesting to see if RAID-DPT M can be reduced to the RS-based RAID-6 in general. We are now ready to formulate general properties of linear block codes suitable for RAID-6 and construct a new class of such codes. 3. RAID-6 based on the cyclic group of a prime order. 3.1. RAID-6 and cones of GLn (F). In this subsection we will define a general mathematical object underlying all existing algebraic RAID-6 schemes. We recall that F = GF (2) is the field of two elements and GLn (F) is the set of n × n invertible matrices. Definition 3.1.1. A cone2 C is a subset of GLn (F) such that g + h ∈ GLn (F) for all g 6= h ∈ C. This notion is related to non-singular difference sets of Blaum and Roth [3]. The cone satisfies the axioms P1 and P2 of Blaum and Roth but the final axiom P3 or P3’ is too restrictive for our ends. On the other hand, we consider only binary codes while Blaum and Roth consider codes over any finite field. A standard example of a cone appears in the context of Galois fields. If we choose a basis of GF (2n ) as a vector space over F then we can think of GLn (F) as the group of all F-linear transformations of GF (2n ). Multiplications by non-zero elements of GF (2n ) form a cone. If α ∈ GF (2m ) is a primitive generator and g is the matrix of multiplication by α then this cone is {g m | 0 ≤ m ≤ 2n − 2}. This cone gives the RS-code with two parity words. The usefulness of cones for RAID-6 is explained by the following Lemma 3.1.2. Let C = {g1 , g2 , . . . gN } ⊆ GLn (F) be a cone of N elements. Then the system of parity equations (7)

dN +1 =

N X

dk

and

dN +2 =

k=1

N X

gk dk

k=1

2 This terminology is slightly questionable. If one asks g + h ∈ C then C ∪ {0} is a convex cone in the usual mathematical sense. Our choice of the term is influenced by this analogy. Non-singular difference set or quasicone or RAID-cone could be more appropriate scientifically but would pay a heavy linguistic toll. 4

has a unique solution with respect to any pair of variables (di , dj ) ∈ Fn × Fn , 1 ≤ i < j ≤ N + 2. Here di are binary n-dimensional vectors. Proof. The fact that system (7) has a unique solution with respect to (dN +1 , dN +2 ) is obvious. The system has a unique solution with respect to (dN +1 , dj ) for any j ≤ N : from the second of equations P (7), dj = gj−1 (dN +2 + N k6=j gk dk ), where we used invertibility of gj ∈ GLn (F). With dj known, dN +1 can be computed from the first of equations (7). The system has a unique solution with respect to (dN +2 , dj ) for any j ≤ N : from the first of equations P (7), dj = dN +1 + N k6=j dk . With dj known, dN +2 can be computed from the second of equations (7). The system has a unique solution with respect to any pair of variables di , dj for 1 ≤ i < j ≤ N : multiplying the first of equations (7) with gi and adding the first and second equations, we get dj = PN (gi + gj )−1 (gi dN +1 + dN +2 + k6=i,j (gk + gi )dk ). Here we used the invertibility of the sum gi + gj for any i 6= j, which follows from the definition of the cone. With dj known, di can be determined from any of the equations (7). QED In the context of RAID-6, di for 1 ≤ i ≤ N can be thought of as n-bit strings of user data, dN +1 , dN +2 - as n-bit parity strings. The lemma proved above ensures that any two strings can be restored from the remaining N strings. We conclude that any cone can be used to build RAID-6. The following lemma gives some necessary conditions for a cone. Lemma 3.1.3. Let C ⊂ GLn (F) be a cone. (i) For all g, h ∈ C such that g 6= h and for all x ∈ GF (2m )n , gx = hx if and only if x = 0. (ii) No two elements of the same cone can share an eigenvector in Fn . (iii) The cone C can contain no more than one permutation matrix. Proof. To prove (i), assume that there is x 6= 0 : gx = hx. Then (g + h)x = 0, which contradicts the fact that g + h is non-degenerate. Therefore, x = 0. Let us prove (ii) now. As elements of C are non-degenerate, the only possible eigenvalue in F is 1, thus for any two elements sharing an eigenvector x, x = hx = gx, which again would imply degeneracy of h + g unless x = 0. The statement (iii) follows from (ii) if one notices that any two permutation matrices share an eigenvector whose components are all equal to one. QED The notion of the cone is convenient for restating well understood conditions for a linear block code to be capable of recovering up to two lost words. Our main challenge is to find examples of cones with sufficiently many elements, which lead to easily implementable RAID-6 systems. We will now construct a class of cones starting with elements of a cyclic subgroup of GLn (F) of a prime order. 3.2. RAID-6 based on matrix generators of ZN . We start with the following Theorem 3.2.1. Let N be an odd number. Let g be an n × n binary matrix such that g N = Id and Id + g m is non-degenerate for each proper3 divisor m of N . Then the elements of cyclic group ZN = {Id, g, g 2 , . . . , g N −1 } form a cone. The proof of the Theorem 3.2.1 is based on the following two lemmas. Lemma 3.2.2. Let g be a binary matrix such that Id + g is non-degenerate and g N = Id, where N is an integer. Then N −1 X

(8)

gk = 0

k=0

Proof. Let us multiply the left hand side of (8) with (Id + g) and simplify the result using that h + h = 0 for any binary matrix: (Id + g)

N −1 X

g k = Id + g + g + g 2 + . . . + g N −1 + g N −1 + g N = Id + g N = Id + Id = 0.

k=0

PN −1 As Id + g is non-degenerate, this implies that k=0 g k = 0. QED Lemma 3.2.2 is a counterpart of a well-known fact from complex analysis that roots of unity add to zero. 3a natural number m < N that divides N 5

Lemma 3.2.3. Let g be a binary matrix such that g N = Id for an odd number N and Id + g m is nondegenerate for every proper divisor m of N . Then the matrix g l + g k is non-degenerate for any k, l : 0 ≤ k < l < N. Proof. As g N = Id, the matrix g is invertible. To prove the lemma, it is therefore sufficient to check the non-degeneracy of Id + g k for 0 < k < N . The group ZN = {1, g, g 2, . . . , g N −1 } is cyclic. An element gk = g k for 0 < k < N generates the cyclic subgroup ZN/d where d is the the greatest common divisor of N and k and the element g d generates the same subgroup. Since the matrix g d satisfies all the conditions of Lemma 3.2.2, the sum of all elements of ZN/d is zero. Therefore, N d

(9)

0=

−1 X

gkm = (Id + gk ) + gk2 (Id + gk ) + . . . + gkN −3 (Id + gk ) + gkN −1 = 0.

m=0

The grouping of terms used in (9) is possible as N/d is odd. Assume that matrix 1 + gk is degenerate. Then there exists a non-zero binary vector x such that (1 + gk )x = 0. Applying both sides of (9) to x we get gkN −1 x = g k(N −1) x = 0. This contradicts non-degeneracy of g. Thus the non-degeneracy of 1 + g k is proved for all 0 < k < N . QED The proof of Theorem 3.2.1. The matrix g described in the statement of the theorem satisfies all requirements of Lemma 3.2.3. The statement of the theorem follows from Definition 3.1.1 of the cone. QED Theorem 3.2.1 allows one to determine whether elements of ZN belong to the same cone by verifying a single non-degeneracy conditions imposed on the generator. The following corollary of Theorem 3.2.1 makes an explicit link between the constructed cone and RAID-6: Corollary 3.2.4. Let g be an n × n binary matrix such that g N = Id for an odd number N and Id + g m is non-degenerate for every proper divisor m of N . The systematic linear block code defined by the parity check matrix   In×n In×n In×n . . . I I 0 P = In×n g g2 . . . g N −1 0 In×n , can recover up to 2 n-bit lost words in known positions. Equivalently, the system of the parity check equations d1 + d2 + . . . + dN + dN +1 = 0 (10)

d1 + gd2 + . . . + g N −1 dN + dN +2 = 0

has a unique solution with respect to any pair of variables (di , dj ), 1 ≤ i < j ≤ N + 2. Proof. It follows from Theorem 3.2.1. that the first N powers of g belong to a cone. The statement of the corollary is an immediate consequence of Lemma 3.1.2 for gk = g k−1 , 1 ≤ k ≤ N . QED. As a simple application of Theorem 3.2.1, let us show  thatthe parity check matrix (5) does indeed satisfy 0 1 all non-degeneracy requirements. The matrix G = is non degenerate and has order 3. Also, the 1 1   1 1 matrix Id + G = is non-degenerate. Hence in virtue of Corollary 3.2.4, the parity check matrix 1 0 (5) determines a RAID system consisting of five disks, that protects against the failure of any two disks. 3.3. Extension of ZN -based cones for certain primes. We will now show that for certain primes, the cone constructed in the previous subsection can be extended. The existence of such extensions give some curious conditions on a prime number, one of which is new to the best of our knowledge. We start with the following Lemma 3.3.1. Let N > 2 be a prime number. Then the group ring R = FZN of the cyclic group of order b k where F b = GF (2d ), d is the smallest positive integer such that 2d = 1 mod N , N is isomorphic to F ⊕ F k = (N − 1)/d. Proof. By the Chinese Remainder Theorem, R ∼ = ⊕kj=0 F[X]/(fj ) where X N − 1 = f0 · f1 · · · fk is the decomposition into irreducible over F polynomials and f0 = X − 1. Let α be a root of fj for some j > 0. d Then d = deg fj is the smallest number such that α ∈ GF (2d ) ∼ = F[X]/(fj ). Hence, α2 −1 = 1 and d is the smallest with such property. As N is prime, α is a primitive N -th root of unity. Hence, N divides 2d − 1 and d is the smallest with such property. 6

It follows that for j > 0 all fj have degree d and all F[X]/(fj ) are isomorphic to GF (2d ). QED. One case of particular interest is k = 1 which happens when 2 is a primitive N − 1-th root of unity modulo N . This forces k = 1 and d = N − 1. Such primes in the first hundred are 3, 5, 11, 13, 19, 29, 37, 53, 59, 61, 67, 83 [15]. If a generalised Riemann’s hypothesis holds true then there are infinitely many such primes [5]. Lemma 3.3.2. Let N > 2 be a prime number such that 2 is a primitive N − 1-th root of unity modulo N . Let g be an n × n binary matrix such that Id + g is non-degenerate and g N = Id. Then the set of 2N −1 − 1 matrices S = {g a1 + g a2 + . . . + g at | 0 ≤ a1 < a2 . . . < at < N, 1 ≤ t ≤ N 2−1 is a cone.} P P Proof. Matrix g defines a ring homomorphism φ : R → Mn (F), φ( k αk X k ) = k αk g k from the group ring to a matrix ring. Since 1 + g is invertible, 1 + g + g 2 + . . . + g N −1 = (1 + g N )(1 + g)−1 = 0 b the image and 1 + X + . . . + X N −1 lies in the kernel of φ. Since R/(1 + X + . . . + X N −1 ) ∼ = F[X]/(f1 ) ∼ = F, N −1 φ(R) is a field and S is a subset of φ(R). Finally, as f1 = 1 + X + . . . + X is the minimal polynomial of g, all elements g a1 + g a2 + . . . + g at listed above are distinct and nonzero and S = φ(R) \ {0}. QED. Notice that for N = 3, the set S consists only of Id and g. The cone S in Lemma 3.3.2. may be difficult to use in a real system but it contains a very convenient subcone as soon as N > 3. This subcone consists of elements g i and Id + g j . The following theorem gives a condition on the prime p for these elements to form a cone. This condition is new to the best of our knowledge. Theorem 3.3.3. The following conditions are equivalent for a prime number N > 2. (1) For any n × n binary matrix g such that Id + g is non-degenerate and g N = Id the set of 2N − 1 matrices S = {Id, g, g 2 , . . . , g N −1 , Id + g, Id + g 2 , . . . , Id + g N −1 } is a cone. (2) For no primitive N -th root of unity α in the algebraic closure of F, the element α + 1 is an N -th root of unity. (3) For any 0 < m < N the polynomials X N + 1 and X m + X + 1 are relatively prime. (4) No primitive N -th root of unity α in the algebraic closure of F satisfies αm + αl + 1 = 0 with N > m > l > 0. Proof. First, we observe that (1) is equivalent to (4). If (4) fails, there exists an N -th root of unity α such that αm + αl + 1 = 0. Let f (X) be the minimal polynomial of α. The matrix g of multiplication by the coset of X in F[X]/(f ) fails condition (1) with g m + g l + 1 = 0. If (4) holds and g is a matrix as in (1) then the elements of S are all invertible matrices by Theorem 3.2.1. Moreover, it only remains to establish that each matrix g m + g l + Id, N > m > l > 0 is invertible. Suppose that it is not invertible. It must have an eigenvector v ∈ Fn with the zero eigenvalue. It follows that fv (X), the minimal polynomial of g with respect to v, divides both X N + 1 and X m + X l + 1. Since 1 is not a root of X m + X l + 1, any root α of fv (X) in the algebraic closure of F is a primitive N -th root of unity and satisfies αm + αl + 1 = 0. Equivalence of (4) and (3) is clear: β = αl is also a primitive root, hence condition (4) can be rewritten as no root β satisfies β s + β + 1 = 0 with N > s > 0. Thus, X N + 1 and X m + X + 1 do not have common roots in the algebraic closure of F and must be relatively prime. Equivalence of (3) and (2) comes from rewriting αm + α + 1 = 0 as αm = α + 1 and observing that αm is necessarily a primitive N -th root of unity. QED. This theorem allows us to sort out whether any particular prime N is suitable for extending the cone. Corollary 3.3.4. A Fermat prime N > 3 satisfies the conditions of Theorem 3.3.3. A Mersenne prime fails the conditions of Theorem 3.3.3. Proof. A Fermat prime is of the form N = 2k + 1. Hence, for a primitive N th root of unity α k

k

(α + 1)N = (α + 1)2 (α + 1) = (α2 + 1)(α + 1) = α−1 + α. If this is equal 1, then α2 + α + 1 = 0, forcing N = 3. A Mersenne prime is of the form N = 2k − 1. Hence, k

k

(α + 1)N = (α + 1)2 (α + 1)−1 = (α2 + 1)(α + 1)−1 = (α + 1)(α + 1)−1 = 1. QED. 7

In fact, most of the primes appear to satisfy the conditions of Theorem 3.3.3. In the first 500 primes, the only primes that fail are Mersenne and 73. Samir Siksek has found several more primes that fail but are not Mersenne. These are (in the bracket we state the order of 2 in the multiplicative group of GF (p)) 73 (9) 178481 (23), 262657 (27), 599479 (33), 616318177 (37), 121369 (39), 164511353 (41), 4432676798593 (49), 3203431780337 (59), 145295143558111 (65), 761838257287 (67), 10052678938039 (69), 9361973132609 (73), 581283643249112959 (77). It would be interesting to know whether there are infinitely many primes failing the conditions of Theorem 3.3.3. Utilising the cone in Theorem 3.3.3., we start with a matrix generator of the cyclic group of an appropriate prime order N to build a RAID-6 system protecting up to 2N − 1 information disks. The explicit expression for Q-parity is (11)

Q=

N −1 X

g k dk +

k=0

2N −2 X

(Id + g k−N +1 )dk ,

k=N

where d0 , d1 , . . . , d2N −2 are information words. 3.4. Specific examples of matrix generators of ZN and the corresponding RAID-6 systems. Now we are ready to construct explicit examples of RAID-6 based on the theory of cones developed in the above subsections. The non-extended code, based on the Sylvester matrix, is known as EVENODD code [2]. Lemma 3.4.1. Let SN be the (N − 1) × (N − 1) Sylvester matrix,   0 0 0 · · · 1  1 0 0 0 · · 1     0 1 0 0 0 · 1   . SN =    · · · · · · ·   0 0 0 · 1 0 1  0 0 0 · · 1 1 Then (i) SN has order N . (ii) Matrix Id + SN is non-degenerate if N is odd and is degenerate if N is even. Proof. (i) An explicit computation shows, that for any (N − 1)-dimensional binary vector x and for any 1 ≤ k ≤ N,       xN −1 xk xk−1  xN −2   xk   xk−2         xN −3   xk    ·             · k   =  ·  +  x1  . SN (12)       ·    ·   0        ·    ·   xN −1        · · · x1 xk xk+1 k In the above formula xj ≡ 0, unless 1 ≤ j ≤ (N − 1). Therefore, SN 6= Id, for any 1 ≤ k ≤ N − 1. Setting N N k = N in the above formula, we get SN x = x for any x, which implies that SN = Id. Therefore, the order of the matrix SN is N . P −1 k (ii) The characteristic polynomial of SN is f (x) = N k=0 x . (In order to prove this it is sufficient to notice that the matrix SN is the companion matrix of the polynomial f (x) [6]. As such, f (x) is both the characteristic and the minimal polynomial of the matrix SN .) Therefore,

f (SN ) =

N −1 X

k SN = 0.

k=0

Notice that the matrix SN is non-degenerate as it has a positive order. If N is odd, we can re-write the characteristic polynomial as (N −3)

3 f (SN ) = (Id + SN )(1 + SN + SN + . . . + SN 8

N −1 ) + SN

Therefore, the degeneracy of Id + SN will contradict the non-degeneracy of SN . If N is even, the sum of all rows of Id + SN is zero, which implies degeneracy. QED Lemma 3.4.1 states that the matrix SN generates the cyclic group ZN and that the matrix Id + SN is non-degenerate for any odd N . Given that N is an odd prime, Corollary 3.2.4 implies that using parity equations (10) with g = SN , it is possible to protect N data disks against the failure of any two disks. Furthermore, if N > 3 is a Fermat prime or 2 is a primitive root modulo N , 2N − 1 data disks can be protected against double failure thanks to the results of section 3.1. We will refer to the RAID-6 system based on Sylvester matrix SN as ZN -RAID. Let us give several examples of such systems. (1) Z3 -RAID has been considered in subsections 2.1, 2.2. It can protect up to 3 information disks against double failure. As N = 3, protection of 5 information disks using extended Q-parity (11) is impossible. (2) Using Z17 -RAID, one can protect up to N = 17 disks using Q-parity (10) and up to 2N − 1 = 33 disks using extended Q-parity (11). (3) Using Z257 -RAID, one can protect up to N = 257 disks using Q-parity (10) and up to 2N − 1 = 513 disks using extended Q-parity (11). It can be seen from (12), that the multiplication of data vectors with any power of the Sylvester matrix SN requires one left and one right shift, one n-bit XOR and one AND only. Thus the operations of updating Q-parity and recovering data within ZN -RAID does not require any special instructions, such as Galois field look-up tables for logarithms and products. As a result, the implementation of ZN -RAID can in some cases be more efficient and quick than the implementation of the more conventional Reed-Solomon based RAID-6. In the next section we will demonstrate the advantage of ZN -RAID using an example of Linux kernel implementation of Z17 -RAID system. 4. Linux Kernel ZN -RAID Implementation 4.1. Syndrome Calculation for the Reed-Solomon RAID-6. First, let us briefly recall the RAID-6 e see [1] for more details. Let D0 , . . . , DN −1 be the scheme based on Reed-Solomon code in the Galois field F, bytes of data from N information disks. Then the parity bytes P and Q are computed as follows, using 4 e g = {02} ∈ F: (13)

P = D0 + D1 + . . . + DN −1 , Q = D0 + gD1 + . . . + g N −1 DN −1 .

The multiplication by g = {02} can be viewed as the following matrix multiplication.        y0 0 0 0 0 0 0 0 1 x0 x7 y1  1 0 0 0 0 0 0 0 x1   x0         y2  0 1 0 0 0 0 0 1 x2  x1 ⊕ x7         y3  0 0 1 0 0 0 0 1 x3  x2 ⊕ x7          (14) y4  = 0 0 0 1 0 0 0 1 x4  = x3 ⊕ x7         y5  0 0 0 0 1 0 0 0 x5   x4         y6  0 0 0 0 0 1 0 0 x6   x5  y7 0 0 0 0 0 0 1 0 x7 x6 Given (14), parity equations (13) become similar to (10). Indeed, the element g generates a cyclic group, so a 2-error correcting Reed-Solomon code is a partial case of a cone based RAID. However, ZN -RAID has several advantages. For instance, using Sylvester matrices one can achieve a simpler implementation of matrix multiplication. 4 Algebraically, we use the standard representation in electronics: F e = GF (2)[x]/I where the ideal I is generated by x8 + x4 + x3 + x2 + 1 and g = x + I 9

4.2. Linux Kernel Implementation of Syndrome Calculation. To compute the Q-parity, we rewrite (13) as Q = D0 + g(D1 + g(. . . + g(DN −2 + gDN −1 ) . . .)))

(15)

which requires (N − 1) multiplications by g = {02}. The product y of a single byte x and g = {02} can be implemented as follows. uint8 t x, y;

y = (x

Suggest Documents