Performance of Two Disk Failure Tolerant Disk Arrays - NJIT

Performance of Two Disk Failure Tolerant Disk Arrays
Chunqi Han
Computer Science Department
New Jersey Institute of Technology (NJIT)
Newark, NJ 07102, USA

Background

- Online disk storage is cheaper and faster than paper.
- Magnetic disks are the dominant storage medium.
- Users are adding nearly 100% storage capacity per year.
- Downtime can be very expensive. Hourly cost of downtime:

    Retail brokerage          $6,500,500
    Credit card authorizer    $2,600,000
    Pay-per-view media        $1,150,000
    Home shopping               $113,000
    Airline reservations         $89,500

- Data availability can be increased by introducing redundancy.

RAID5 -- Single disk failure tolerant array

- Rotated block-interleaved parity (left-symmetric layout)
- Definition:   P0-4 = D0 ⊕ D1 ⊕ D2 ⊕ D3 ⊕ D4
- Update:       P0-4(new) = D1(new) ⊕ D1(old) ⊕ P0-4(old)   (4 disk accesses)
- Reconstruct:  D0 = D1 ⊕ D2 ⊕ D3 ⊕ D4 ⊕ P0-4
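The three parity relations on this slide can be checked with a few lines of Python. This is only an illustrative sketch: the helper `xor_blocks`, the 4-byte block size, and the block contents are assumptions made for the example, not part of the original slides.

```python
from functools import reduce

def xor_blocks(*blocks: bytes) -> bytes:
    """Byte-wise XOR of equally sized blocks (hypothetical helper)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# Five made-up 4-byte data blocks D0..D4 forming one stripe.
data = [bytes([i, i + 1, i + 2, i + 3]) for i in (0, 10, 20, 30, 40)]

# Definition: P0-4 = D0 xor D1 xor D2 xor D3 xor D4
parity = xor_blocks(*data)

# Small write to D1: read old D1 and old P, write new D1 and new P (4 accesses).
d1_new = b"\xde\xad\xbe\xef"
parity_new = xor_blocks(d1_new, data[1], parity)
data[1] = d1_new

# Reconstruct D0 from the surviving data blocks and the parity.
d0_rebuilt = xor_blocks(data[1], data[2], data[3], data[4], parity_new)
```

The updated parity equals the parity recomputed from scratch, which is why the small-write shortcut is valid.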

RAID5 operating modes:

- Normal mode:
  - Write/update requires updating the parity block (the “small write penalty”).
- Degraded mode (with one disk failure):
  - To access data on the failed disk, read and XOR all the corresponding blocks from the surviving disks.
  - Doubles the load on the surviving disks.
- Rebuild mode
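The "doubles disk loads" remark can be illustrated by counting block reads. The sketch below assumes a read-only workload that addresses every disk equally often; the function name and parameters are hypothetical.

```python
def reads_per_disk(n_disks: int, failed: int) -> list[int]:
    """Count block reads per disk when one read is issued to each disk in turn.

    Assumption: read-only workload, uniform over all disks.
    """
    counts = [0] * n_disks
    for target in range(n_disks):
        if target == failed:
            # Rebuild the block on the fly: read the matching block
            # from every surviving disk and XOR them.
            for disk in range(n_disks):
                if disk != failed:
                    counts[disk] += 1
        else:
            counts[target] += 1
    return counts

loads = reads_per_disk(n_disks=6, failed=2)
```

In normal mode each disk would serve one read in this experiment; with one failed disk each survivor serves two.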

Reasons why RAID5 is not enough

- RAID5 is vulnerable to data loss after a single disk failure, until the failed disk's contents are reconstructed, due to:
  - Uncorrectable errors
  - Hidden faults on surviving disks
  - A second disk failure before reconstruction is completed
- Increasing disk capacity makes the problem worse.
- Solution: two disk failure tolerant arrays

Double disk failure tolerant arrays

- RAID6: uses a Reed-Solomon code
  - StorageTek: Iceberg
  - Hewlett-Packard: RAID5 DP (Double Parity)
- EVENODD: proposed by M. Blaum et al. ’95
  - Uses parity only, minimal redundancy
- RM2: proposed by C.I. Park
  - Uses parity only

Performance considerations

No free lunch:
  higher reliability
  → higher redundancy (and cost)
  → more blocks to be updated for a write request
  → higher overhead for writes
  → lower performance in degraded mode

Motivations for the study

Compare RAID0, RAID5, RAID6, EVENODD and RM2:
- Overhead for tolerating single and double disk failures (number of accessed blocks).
- Performance in normal mode and in degraded mode with single or double disk failures (response times).

Accomplishments

- Device-independent cost functions derived for the various schemes.
- Maximum throughputs obtained for a given workload and disk characteristics.
- A queuing model developed to obtain mean response times.
- Queuing model validated by simulation results.
- Performance and scalability of small, intermediate, and large disk array configurations compared.

Methodology: enumerate the various cases and weigh them by their frequencies.

A sample case breakdown graph (RAID6 degraded mode, 1 failed disk):

fr = fraction of reads, fw = fraction of writes, RMW = read-modify-write, F/J = fork/join

Incoming requests
- Read (fr)
  - Read from a non-failed disk ((N-1)/N): 1 simple read
  - Read from the failed disk (1/N): fork/join read of the surviving (N-2) disks, (F/J)(N-2)read
- Write (fw)
  - Both data and parity are on non-failed disks ((N-3)/N): 3 RMW on the data and parity disks, RMW + (F/J)2RMW
  - Data on the failed disk, parities on non-failed disks (1/N): reconstruct write, i.e. read the surviving (N-3) data blocks, then write the 2 parity blocks, (F/J)(N-3)read + (F/J)2write
  - Data on a non-failed disk, one of the two parities on the failed disk (2/N): 2 RMW
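The weighting of cases by frequency can be sketched as an expected number of disk accesses per request for RAID6 with one failed disk. The sketch assumes each RMW costs two accesses (one read plus one write, in line with the "4 accesses" quoted for a RAID5 update) and counts only accesses, not response time; the function names are hypothetical.

```python
from fractions import Fraction

RMW = 2  # assumption: one read-modify-write = 1 read + 1 write

def raid6_degraded_cases(n: int):
    """Return (label, probability, disk accesses) per case, one failed disk."""
    reads = [
        ("read from non-failed disk", Fraction(n - 1, n), 1),
        ("read from failed disk", Fraction(1, n), n - 2),
    ]
    writes = [
        ("data and both parities intact", Fraction(n - 3, n), 3 * RMW),
        ("data on failed disk", Fraction(1, n), (n - 3) + 2),
        ("one parity on failed disk", Fraction(2, n), 2 * RMW),
    ]
    return reads, writes

def mean_accesses(n: int, f_read: Fraction) -> Fraction:
    """Frequency-weighted mean number of disk accesses per request."""
    reads, writes = raid6_degraded_cases(n)
    read_cost = sum(p * cost for _, p, cost in reads)
    write_cost = sum(p * cost for _, p, cost in writes)
    return f_read * read_cost + (1 - f_read) * write_cost

# 19 disks and 75% reads, the configuration used in the plots.
example = mean_accesses(19, Fraction(3, 4))
```

Using `Fraction` keeps the case probabilities exact, so it is easy to confirm that each group of cases sums to one.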

RAID5 cost of operation


RAID6 Overview

- RAID6 uses the Reed-Solomon code.
- Parity layout is similar to the left-symmetric layout in RAID5:

    D0      D1      D2      D3      D4      P0-4    Q0-4
    D6      D7      D8      D9      P5-9    Q5-9    D5
    D12     D13     D14     P10-14  Q10-14  D10     D11
    D18     D19     P15-19  Q15-19  D15     D16     D17
    D24     P20-24  Q20-24  D20     D21     D22     D23
    P25-29  Q25-29  D25     D26     D27     D28     D29
    Q30-34  D30     D31     D32     D33     D34     P30-34
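The rotation in the layout on this slide can be reproduced programmatically. The sketch below is inferred from the seven-disk example only: P shifts one disk to the left per stripe, Q sits immediately to its right, and data blocks continue after Q with wrap-around. The function name `raid6_layout` is hypothetical.

```python
def raid6_layout(n_disks: int, n_stripes: int) -> list[list[str]]:
    """Rotated P+Q layout, one row per stripe (inferred from the slide)."""
    n_data = n_disks - 2
    rows = []
    for s in range(n_stripes):
        p = (n_disks - 2 - s) % n_disks   # P moves one disk left per stripe
        q = (p + 1) % n_disks             # Q is placed right after P
        first = s * n_data
        last = first + n_data - 1
        row = [""] * n_disks
        row[p] = f"P{first}-{last}"
        row[q] = f"Q{first}-{last}"
        for k in range(n_data):
            # Data blocks follow Q, wrapping around to disk 0.
            row[(q + 1 + k) % n_disks] = f"D{first + k}"
        rows.append(row)
    return rows

for row in raid6_layout(7, 7):
    print("  ".join(f"{cell:<7}" for cell in row))
```

Running it for 7 disks and 7 stripes prints the same rows as the layout above.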

RAID6 cost of operation


EVENODD organization

Two kinds of parities:
- Horizontal parities P (same as in RAID5)
- Diagonal parities Q (shown below)

[Figure: EVENODD layout on (m+2) disks. Each cell is a “symbol”; a “segment” = (m-1) symbols; m is a prime number. The figure also defines a term S as an XOR (⊕) whose operands are not recoverable from this extraction.]

Figure extracted from M. Blaum et al., “EVENODD: An optimal scheme for tolerating double disk failures in RAID architectures”, with minor modification.

RM2

Sample RM with M=3 (33.3% redundancy) and T=7:

[Figures: the Redundancy Matrix and the corresponding disk layout]

Each data block d_{i,j} is protected by parities p_i and p_j.

RM2 cost of operation


Queuing model

- Mean response times are estimated with an M/G/1 model.
- The accuracy of the M/G/1 model was verified against a detailed simulation of a single disk (error < 1%).
- An extreme-value distribution is used to approximate the n-way fork-join response time $R_{F/J}^{n}$.
  - Simulation results show that $R_{\max}^{n}$ is a tight upper bound on $R_{F/J}^{n}$.
  - Matching the first two moments of the read response time (mean $R_r$, standard deviation $\sigma_r$) with the extreme-value distribution gives

    $R_{\max}^{n} = R_r + \frac{\sqrt{6}}{\pi}\,\sigma_r \ln n$
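The two ingredients of the queuing model can be sketched numerically. The first function uses the standard Pollaczek-Khinchine formula for the M/G/1 mean response time; the slides do not spell out which M/G/1 expression was used, so this is an assumption. The second takes the coefficient as √6/π, which is what matching the mean and variance of an extreme-value (Gumbel) distribution yields. All function and parameter names are hypothetical.

```python
import math

def mg1_mean_response(arrival_rate: float, mean_service: float,
                      second_moment_service: float) -> float:
    """Mean response time of an M/G/1 queue (Pollaczek-Khinchine)."""
    rho = arrival_rate * mean_service  # disk utilization
    if rho >= 1:
        raise ValueError("unstable queue: utilization must be below 1")
    mean_wait = arrival_rate * second_moment_service / (2 * (1 - rho))
    return mean_service + mean_wait

def fork_join_upper_bound(mean_resp: float, std_resp: float, n: int) -> float:
    """Expected maximum of n response times, extreme-value approximation."""
    return mean_resp + (math.sqrt(6) / math.pi) * std_resp * math.log(n)

# Sanity check with exponential service (M/M/1): E[S] = 1, E[S^2] = 2.
r_mm1 = mg1_mean_response(arrival_rate=0.5, mean_service=1.0,
                          second_moment_service=2.0)
```

For n = 1 the bound reduces to the plain mean response time, and it grows logarithmically with the number of disks joined.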

Normal mode read response time

[Figure: read response time (ms, 0 to 300) versus arrival rate (1/s, 0 to 1600). Analytical and simulation curves for RAID0, RAID5, RAID6 and RM2; RAID6 and RM2 share one analytical curve. Normal mode, 19 disks, 75% read, 4KB blocks, IBM18ES disk.]

Read response time with single disk failure

[Figure: read response time (ms, 0 to 300) versus arrival rate (1/s, 0 to 700). Analytical and simulation curves for RAID5, RAID6 and RM2. 19 disks, 75% read.]

Read response time with two disk failures

[Figure: read response time (ms, 0 to 300) versus arrival rate (1/s, 0 to 450). Analytical and simulation curves for RAID6 and RM2. 19 disks, 75% read.]

Throughputs in degraded modes with respect to normal mode

The performance degradation of each scheme can be read from the table below:

- With a single disk failure, RM2 performs better due to the declustering effect.
- With two disk failures, RAID6 and EVENODD attain higher maximum throughput.

Conclusions

- Performance with one disk failure is similar to RAID5 for all schemes.
- With one and two disk failures, RAID6 and EVENODD achieve 70% and 50%, respectively, of the maximum throughput in normal mode.
- RM2 can achieve about 90% of the throughput of RAID6 (and EVENODD) with two disk failures. It can be used sparingly to recover from localized faults.
