EVENODD: An Optimal Scheme for Tolerating ... - EECS at UC Berkeley

EVENODD: An Optimal Scheme for Tolerating Double Disk Failures in RAID Architectures

Mario Blaum*, Jim Brady†, Jehoshua Bruck*, Jai Menon*

*IBM Almaden Research Center, San Jose, CA 95120
{blaum,bruck,menonjm}@almaden.ibm.com

†IBM SSD, San Jose, CA 95193

Abstract

We present a novel method, that we call EVENODD, for tolerating up to two disk failures in RAID architectures. EVENODD is the first known scheme for tolerating double disk failures that is optimal with regard to both storage and performance. EVENODD employs the addition of only two redundant disks and consists of simple exclusive-OR computations. A major advantage of EVENODD is that it only requires parity hardware, which is typically present in standard RAID-5 controllers. Hence, EVENODD can be implemented on standard RAID-5 controllers without any hardware changes. The only previously known scheme that employs optimal redundant storage (i.e. two extra disks) is based on Reed-Solomon (RS) error-correcting codes, requires computation over finite fields and results in a more complex implementation. For example, we show that the number of exclusive-OR operations involved in implementing EVENODD in a disk array with 15 disks is about 50% of the number required when using the RS scheme.

1063-6897/94 $03.00 © 1994 IEEE

1 Introduction

Disk arrays [16], in particular RAID-3 and RAID-5 disk arrays, have become an accepted way for designing highly available and reliable disk subsystems. In such arrays, the exclusive-OR of data from some number of disks is maintained on a redundant disk. When a disk fails, the data on it can be reconstructed by exclusive-ORing the data on the surviving disks, and writing this into a spare disk. The mean time to data loss (MTTDL) of such a system is proportional to the square of the disk mean time between failures (MTBF) and inversely proportional to the square of the number of disks and the mean time to reconstruct (MTTR) the failed disk [16]. Data are lost if a second disk fails before the reconstruction is complete.

Such disk arrays, which can protect from the simultaneous loss of no more than one disk, will have acceptable MTTDL when the number of disks in the subsystem is small. However, the average number of disks per installation is growing because of two reasons. First, storage requirements for data are increasing, caused by normal growth and by the increase in new forms of data like audio, video and fax. Second, disk form factors are becoming smaller. [6] explores whether improving disk MTBF or decreasing MTTR can adequately compensate for the increase in the number of disks in an installation and concludes that it will not. As these trends accelerate, traditional arrays will prove to be inadequate by the year 2000 [6]. As a result, a lot of interest has arisen in large disk arrays and in attempting to design systems that will not lose data even when multiple disks fail simultaneously [4, 5, 8, 13].

For this, the use of erasure-correcting codes with higher correcting capability than simple parity is suggested [8] (in coding theory terminology, an erasure is an error whose location is known). Theoretically, in order to retrieve the information lost in two failed (erased) disks, we need at least two redundant disks (in coding theory, this is known as the Singleton bound [11]). A natural scheme, then, for recovering the information lost in two disks, is using the so called Reed-Solomon codes [11]. However, Reed-Solomon codes involve operations over finite fields. It would be desirable to have codes doing exclusive-OR operations only, as in the case of simple parity. This was achieved in [17], although this code has the following drawback: when the error correcting capability of the code is broken, there is an infinite error propagation. Moreover, since the code is of convolutional type, there is an overhead at the end of the data. Therefore, the problem is finding codes based on exclusive-OR operations and of block type. A solution was obtained in [1, 4, 9, 10] and later generalized in [5] for multiple erasures. However, those solutions, although very simple, still involve recursive computations (which are inefficient and hard to implement using existing hardware) at the encoding process and during small write operations. For higher redundancy, the codes in [7, 14, 15] have the same disadvantages.

In this paper, we present a novel and efficient encoding procedure that is based on exclusive-OR operations and does not involve recursive computations. We also present a simple decoding procedure. We have calculated the complexity of implementation of EVENODD and compared it to that of the traditional single parity scheme; EVENODD requires about twice as many simple exclusive-OR operations. This is optimal since we have two redundant disks. EVENODD is substantially more efficient when compared to Reed-Solomon codes; for example, the number of exclusive-OR operations involved in implementing EVENODD in a disk array with 15 disks is about 50% of the number required when using the RS scheme.

In addition to having an optimal number of operations for the encoding procedure, EVENODD has the advantages that the encoding procedure can be implemented using existing parity hardware and that small write operations are very efficient. In particular, when a disk sector is modified, only two other disk sectors will need to be modified at the same time. We note here that EVENODD corresponds to a new 2-erasure correcting code which is optimal in terms of the redundancy and has very efficient encoding and decoding algorithms. Hence, EVENODD can be used in other applications where there is a need of correcting two erased symbols with low complexity; for example, in multi-track magnetic recording. Moreover, the decoding algorithm can still be used to correct one random error in such applications.

The paper is organized as follows: in the next section we describe the encoding scheme. In Section 3 we present the decoding procedure used by our scheme after the failure of one or two disks. In Section 4 we address the implementation of small write operations. In Section 5 we address the complexity of implementation of EVENODD by comparing it to that of the traditional parity scheme as well as to the scheme based on Reed-Solomon codes. In Section 6 we discuss performance issues of our scheme. Finally, in Section 7, we present some concluding remarks.

2 Encoding

We will assume that there are $m + 2$ disks, where the information is stored in the first $m$ disks while the redundancy is stored in the last two disks. It is possible, however, to distribute the redundancy among all disks in order to avoid bottleneck effects when repeated write operations are performed. That is, our description is of a scheme that is an extension of RAID-4 (where the parity is dedicated), but it is easy to imagine how it can be made an extension of RAID-5 (where the parity is distributed).

We assume that $m$ is a prime number. However, this requirement has no effect: EVENODD can handle an arbitrary number of disks simply by assuming that there are disks with no information on them (all the information bits are 0). For simplicity, we assume that each of the $m$ disks has only $m - 1$ symbols. Our procedure works for disks with arbitrary capacity by treating each block of $m - 1$ symbols separately.

In order to simplify the presentation, we will assume in some of our examples that each symbol is a bit. It is not necessary to assume that the symbols are binary (in fact, it can be shown that our scheme works even when the symbols are elements of an Abelian group). A practical implementation is to consider a symbol as an 8-bit byte, and to assume that there are $m - 1 = 256$ symbols (half a sector). Notice that 257 is a prime number.

Based on the assumptions above, the problem of tolerating two disk failures can be described as follows:

Problem: Consider the $(m - 1) \times (m + 2)$ array such that symbol $a_{ij}$, $0 \le i$