Distributed Storage Allocation for High Reliability

Distributed Storage Allocation for High Reliability Derek Leong1 1

2

Alex Dimakis2

Tracey Ho1

Department of Electrical Engineering California Institute of Technology Pasadena, California, USA

Department of Electrical Engineering University of Southern California Los Angeles, California, USA

ICC 2010 2010-05-26

Introduction Motivating Example

A Water Analogy

Distributed Storage Allocation for High Reliability

ICC 2010 2010-05-26

Introduction Motivating Example A thirsty lion wanders across the savanna in search of water ...


ICC 2010 2010-05-26

Introduction Motivating Example ... he needs at least 1ℓ of water to survive ...

≥ 1


ICC 2010 2010-05-26

Introduction Motivating Example ... suddenly, he finds five abandoned jerrycans ...

≥ 1


ICC 2010 2010-05-26

Introduction Motivating Example ... he chooses three jerrycans at random and starts chewing them open ...

≥ 1


ICC 2010 2010-05-26


Budget of 1.5ℓ Survival Probability

Allocation A

1

1 2

0

0

0

?

B

1 3

1 3

1 3

1 3

1 6

?

C

1 2

1 2

1 2

0

0

?


ICC 2010 2010-05-26


Allocation A 3-Subset

1

1 2

0

0

0

(i)

1

0

0

0

(ii)

1

1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

(iii)

1

(iv)

1

(v)

1

(vi)

1

(vii)

1

(viii)

1

(ix)

1

(x)

1


P

≥ 1?

" " " " " " % % % %

ICC 2010 2010-05-26


Allocation A 3-Subset

1

1 2

0

0

0

(i)

1

0

0

0

(ii)

1

1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

(iii)

1

(iv)

1

(v)

1

(vi)

1

(vii)

1

(viii)

1

(ix)

1

(x)

1


P

≥ 1?

" " " " " " % % % %

ICC 2010 2010-05-26

60% Survival Probability


Allocation B 3-Subset

1 3

1 3

1 3

1 3

1 6

(i)

1 3 1 3 1 3 1 3 1 3 1 3 1 3 1 3 1 3 1 3

1 3 1 3 1 3 1 3 1 3 1 3 1 3 1 3 1 3 1 3

1 3 1 3 1 3 1 3 1 3 1 3 1 3 1 3 1 3 1 3

1 3 1 3 1 3 1 3 1 3 1 3 1 3 1 3 1 3 1 3

1 6 1 6 1 6 1 6 1 6 1 6 1 6 1 6 1 6 1 6

(ii) (iii) (iv) (v) (vi) (vii) (viii) (ix) (x)


P

≥ 1?

" " % " % % " % % %

ICC 2010 2010-05-26


Allocation B 3-Subset

1 3

1 3

1 3

1 3

1 6

(i)

1 3 1 3 1 3 1 3 1 3 1 3 1 3 1 3 1 3 1 3

1 3 1 3 1 3 1 3 1 3 1 3 1 3 1 3 1 3 1 3

1 3 1 3 1 3 1 3 1 3 1 3 1 3 1 3 1 3 1 3

1 3 1 3 1 3 1 3 1 3 1 3 1 3 1 3 1 3 1 3

1 6 1 6 1 6 1 6 1 6 1 6 1 6 1 6 1 6 1 6



P

≥ 1?

" " % " % % " % % %

ICC 2010 2010-05-26



Allocation C 3-Subset

1 2

1 2

1 2

0

0

(i)

1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2

1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2

1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0



P

≥ 1?

" " " " " % " " % %

ICC 2010 2010-05-26


Allocation C 3-Subset

1 2

1 2

1 2

0

0

(i)

1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2

1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2

1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0



P

≥ 1?

" " " " " % " " % %

ICC 2010 2010-05-26



Budget of 1.5ℓ Survival Probability

Allocation A

1

1 2

0

0

0

60%

B

1 3

1 3

1 3

1 3

1 6

40%

C

1 2

1 2

1 2

0

0

70%


ICC 2010 2010-05-26


Water ≈ Coded Data


ICC 2010 2010-05-26


Water ≈ Coded Data


ICC 2010 2010-05-26


Water ≈ Coded Data storage node 1

x1

t1

storage node 2

r1

...

...

s

x2

storage node n

xn


r2 ICC 2010 2010-05-26

t2

Introduction Key Question

storage node 1

x1

storage node 2

...

x2

...

s

r

t

Given a limited storage budget, how should we store a data object over a set of nodes so that it can be recovered with maximum reliability?

storage node n

xn


ICC 2010 2010-05-26


storage node 1

x1

storage node 2

...

x2

...

s

r

t


storage node n

xn


ICC 2010 2010-05-26


storage node 1

x1

storage node 2

...

x2

...

s

r

t


storage node n

xn


ICC 2010 2010-05-26


storage node 1

x1

storage node 2

...

x2

...

s

r

t


storage node n

xn


ICC 2010 2010-05-26

General Problem Description Storage Allocation

A source has a data object of unit size which it can code and store over a set of n storage nodes Let 1 , 2 , . . . , n be the amount of coded data stored in node 1, 2, . . . , n

storage node 1

x1

storage node 2

...

x2

...

s

storage node n

xn

r

t

Although any amount of data can be stored in each node, the total amount of storage used must not exceed a given budget T , i.e. n X

 ≤ T

=1


ICC 2010 2010-05-26



storage node 1

x1

storage node 2

...

x2

...

s

storage node n

xn

r

t


 ≤ T

=1


ICC 2010 2010-05-26



storage node 1

x1

storage node 2

...

x2

...

s

storage node n

xn

r

t


 ≤ T

=1


ICC 2010 2010-05-26

General Problem Description Access by a Data Collector

A data collector subsequently attempts to recover the original data object by accessing only a random subset r of the nodes, where r is to be specified by the assumed access model or failure model

storage node 1

x1

storage node 2

...

x2

...

s

storage node n

xn

r

t

By using an appropriate code, successful recovery occurs when the total amount of data in the accessed nodes is at least the size of the original data object, i.e. X

 ≥ 1

∈r


ICC 2010 2010-05-26

General Problem Description Access by a Data Collector

A data collector subsequently attempts to recover the original data object by accessing only a random subset r of the nodes, where r is to be specified by the assumed access model or failure model

storage node 1

x1

storage node 2

...

x2

...

s

storage node n

xn

r

t

By using an appropriate code, successful recovery occurs when the total amount of data in the accessed nodes is at least the size of the original data object, i.e. X

 ≥ 1

∈r


ICC 2010 2010-05-26

General Problem Description Objective

storage node 1

x1

storage node 2

...

x2

...

s

r

t

For a given budget T , we seek the optimal allocation {1 , 2 , . . . , n } that maximizes the probability of successful recovery   X P    ≥ 1 ∈r

storage node n

xn

This optimization problem is difficult in general because the objective function is discrete and nonconvex, and there is a large space of feasible allocations to consider


ICC 2010 2010-05-26

General Problem Description Objective

storage node 1

x1

storage node 2

...

x2

...

s

r

t

For a given budget T , we seek the optimal allocation {1 , 2 , . . . , n } that maximizes the probability of successful recovery   X P    ≥ 1 ∈r

storage node n

xn

This optimization problem is difficult in general because the objective function is discrete and nonconvex, and there is a large space of feasible allocations to consider


ICC 2010 2010-05-26

General Problem Description Related Work

Discussion between R. Karp, R. Kleinberg, C. Papadimitriou, E. Friedman, and others at UC Berkeley, 2005 storage node 1

x1

storage node 2

...

x2

...

s

storage node n

xn

r

t

S. Jain, M. Demmer, R. Patra, K. Fall, “Using redundancy to cope with failures in a delay tolerant network,” SIGCOMM 2005 LDH, “Distributed storage allocation problems,” NetCod 2009 M. Sardari, R. Restrepo, F. Fekri, E. Soljanin, “Memory allocation in distributed storage networks,” ISIT 2010


ICC 2010 2010-05-26



x1

storage node 2

...

x2

...

s

storage node n

xn

r

t



ICC 2010 2010-05-26



x1

storage node 2

...

x2

...

s

storage node n

xn

r

t



ICC 2010 2010-05-26



x1

storage node 2

...

x2

...

s

storage node n

xn

r

t



ICC 2010 2010-05-26

Access to a Fixed-Size Subset of Nodes Problem Description Data collector accesses an r -subset of storage nodes, selected uniformly at random from the collection of all possible r -subsets, where r is a constant


ICC 2010 2010-05-26

Access to a Fixed-Size Subset of Nodes Problem Description Maximize recovery probability for a given budget T storage node 1

x1

(n, r, T) :

storage node 2

...

...

s

x2

r

t

X

maximize 1 ,...,n

storage node n

xn

r⊂{1,...,n}: |r|=r

  X  n ·    ≥ 1

1 r

∈r

subject to Data collector accesses an r -subset of storage nodes, selected uniformly at random from the collection of all possible r -subsets, where r is a constant

n X

 ≤ T

=1

 ≥ 0

∀  ∈ {1, . . . , n}

For the trivial budget T = 1, the optimal allocation is {1, 0, . . . , 0}


ICC 2010 2010-05-26

Access to a Fixed-Size Subset of Nodes Problem Description Minimize budget required to achieve a given recovery probability PS storage node 1

x1

0 (n, r, PS ) :

storage node 2

...

x2

...

s

r

storage node n

xn

t

minimize

T

1 ,...,n

subject to

 Data collector accesses an r -subset of storage nodes, selected uniformly at random from the collection of all possible r -subsets, where r is a constant

X



r⊂{1,...,n}: |r|=r


X



n

  ≥ 1 ≥ PS

r

∈r n X

 ≤ T

=1

 ≥ 0

ICC 2010 2010-05-26

∀  ∈ {1, . . . , n}

Access to a Fixed-Size Subset of Nodes Some Numerical Results (n, r) = (5, 3) maximum 1.0 recovery 0.9 probability

max Ps

0.8 0.7 0.6 0.5 0.4 0.3 0.2 0.1 0

0.2

0.4

0.6

0.8

1.0

budget T Distributed Storage Allocation for High Reliability

ICC 2010 2010-05-26

1.2

1.4

1.6


max Ps

1 1 1 1 1 {0,0,0,0,0} 3 3 3 3 3

0.8

1 1 1 {0,0,0,0,0} 2 2 2

0.7 0.6

{1,0,0,0,0}

0.5 0.4 0.3 0.2 0.1 0

0.2

0.4

0.6

0.8

1.0


ICC 2010 2010-05-26

1.2

1.4

1.6


max Ps

0.8 0.7 0.6 0.5 0.4 0.3 0.2 0.1 0

0.5

1.0

1.5

2.0


ICC 2010 2010-05-26

2.5

3.0


max Ps

1 1 1 1 1 1 {0,0,0,0,0,0} 2 2 2 2 2 2

0.8

1 1 1 1 1 {0,0,0,0,0,0} 2 2 2 2 2

0.7 0.6

{1,1,0,0,0,0}

0.5 0.4 0.3 0.2

{1,0,0,0,0,0}

0.1 0

0.5

1.0

1.5

2.0


ICC 2010 2010-05-26

2.5

3.0

Access to a Fixed-Size Subset of Nodes Special Case of Probability-1 Recovery

storage node 1

x1

Theorem: Probability-1 Recovery

storage node 2

...

x2

...

s

r

t

The allocation

storage node n

xn

Data collector accesses an r -subset of storage nodes, selected uniformly at random from the collection of all possible r -subsets, where r is a constant

 =

1 r

,

 = 1, . . . , n,

minimizes the budget if probability-1 recovery is required.


ICC 2010 2010-05-26

Access to a Fixed-Size Subset of Nodes Regime of High Recovery Probability

storage node 1

Theorem: Case of r | n

storage node 2

Suppose that n is a multiple of r . The allocation

x1

...

...

s

x2

r

t

storage node n

xn


 =

1 r

,

 = 1, . . . , n,

minimizes the budget if and only if the required recovery probability exceeds


r 1−

. n

ICC 2010 2010-05-26

Access to a Fixed-Size Subset of Nodes Regime of High Recovery Probability Theorem: Case of r - n Suppose that n is not a multiple of r . The allocation

storage node 1

x1

storage node 2

...

x2

...

s

r

t

 =

storage node n

xn


1 r

,

 = 1, . . . , n,

minimizes the budget if the required recovery probability exceeds

1−

gcd(r, r 0 ) α gcd(r, r 0 ) + r 0

where n = α r + r 0 , α ∈ Z+ , 0 0 and r ∈ {r + 1, . . . , 2r − 1}.


ICC 2010 2010-05-26

,

Access to a Fixed-Size Subset of Nodes Regime of High Recovery Probability Proof Idea: Consider (n, r) = (4, 2). Begin with the case of probability-1 recovery... storage node 1

x1

storage node 2

...

x2

...

s

r

storage node n

xn


t

0 (n = 4, r = 2, PS = 1) : minimize

T

1 ,2 ,3 ,4

subject to

1 + 2 ≥ 1

2 + 3 ≥ 1

1 + 3 ≥ 1

2 + 4 ≥ 1

1 + 4 ≥ 1

3 + 4 ≥ 1

T ≥ 1 + 2 + 3 + 4 1 ≥ 0 2 ≥ 0 3 ≥ 0 4 ≥ 0


ICC 2010 2010-05-26

Access to a Fixed-Size Subset of Nodes Regime of High Recovery Probability Proof Idea: Consider (n, r) = (4, 2). 1 1 Optimal allocation is 1 = 2 = 3 = 4 = r = 2 , n which gives T = r = 2.

storage node 1

x1

storage node 2

...

x2

...

s

r

storage node n

xn


t

0 (n = 4, r = 2, PS = 1) : minimize

T

1 ,2 ,3 ,4

subject to

1 + 2 ≥ 1

2 + 3 ≥ 1

1 + 3 ≥ 1

2 + 4 ≥ 1

1 + 4 ≥ 1

3 + 4 ≥ 1

T ≥ 1 + 2 + 3 + 4 1 ≥ 0 2 ≥ 0 3 ≥ 0 4 ≥ 0


ICC 2010 2010-05-26

Access to a Fixed-Size Subset of Nodes Regime of High Recovery Probability Proof Idea: Consider (n, r) = (4, 2). This allocation remains optimal even after dropping some r -subset constraints...

storage node 1

x1

storage node 2

...

x2

...

s

r

storage node n

xn


t

0 (n = 4, r = 2, PS = 1) : minimize

T

1 ,2 ,3 ,4

subject to

1 + 2 ≥ 1

2 + 3 ≥ 1

1 + 3 ≥ 1

2 + 4 ≥ 1

1 + 4 ≥ 1

3 + 4 ≥ 1

T ≥ 1 + 2 + 3 + 4 1 ≥ 0 2 ≥ 0 3 ≥ 0 4 ≥ 0


ICC 2010 2010-05-26

Access to a Fixed-Size Subset of Nodes Regime of High Recovery Probability Proof Idea: Consider (n, r) = (4, 2). n We still need T ≥ 2 = r in order to satisfy the highlighted r -subset constraints...

storage node 1

x1

storage node 2

...

x2

...

s

r

storage node n

xn


t

0 (n = 4, r = 2, PS = 1) : minimize

T

1 ,2 ,3 ,4

subject to

1 + 2 ≥ 1

2 + 3 ≥ 1

1 + 3 ≥ 1

2 + 4 ≥ 1

1 + 4 ≥ 1

3 + 4 ≥ 1

T ≥ 1 + 2 + 3 + 4 1 ≥ 0 2 ≥ 0 3 ≥ 0 4 ≥ 0


ICC 2010 2010-05-26


storage node 1

x1

storage node 2

...

x2

...

s

r

storage node n

xn


t

0 (n = 4, r = 2, PS = 1) : minimize

T

1 ,2 ,3 ,4

subject to

1 + 2 ≥ 1

2 + 3 ≥ 1

1 + 3 ≥ 1

2 + 4 ≥ 1

1 + 4 ≥ 1

3 + 4 ≥ 1

T ≥ 1 + 2 + 3 + 4 1 ≥ 0 2 ≥ 0 3 ≥ 0 4 ≥ 0


ICC 2010 2010-05-26


storage node 1

x1

storage node 2

...

x2

...

s

r

storage node n

xn


t

0 (n = 4, r = 2, PS = 1) : minimize

T

1 ,2 ,3 ,4

subject to

1 + 2 ≥ 1

2 + 3 ≥ 1

1 + 3 ≥ 1

2 + 4 ≥ 1

1 + 4 ≥ 1

3 + 4 ≥ 1

T ≥ 1 + 2 + 3 + 4 1 ≥ 0 2 ≥ 0 3 ≥ 0 4 ≥ 0


ICC 2010 2010-05-26

Access to a Fixed-Size Subset of Nodes Regime of High Recovery Probability Proof Idea: Consider (n, r) = (4, 2). Each r -subset constraint appears in exactly one “partition”...

storage node 1

x1

r -subset Constraints

storage node 2

...

x2

...

s

r

storage node n

xn


t

“Partitions”

1 + 2 ≥ 1 1 + 3 ≥ 1 1 + 4 ≥ 1

{1 + 2 ≥ 1, 3 + 4 ≥ 1}

{1 + 3 ≥ 1, 2 + 4 ≥ 1}

2 + 3 ≥ 1 2 + 4 ≥ 1

{1 + 4 ≥ 1, 2 + 3 ≥ 1}

3 + 4 ≥ 1 Distributed Storage Allocation for High Reliability

ICC 2010 2010-05-26

Access to a Fixed-Size Subset of Nodes Regime of High Recovery Probability Proof Idea: Consider (n, r) = (4, 2). For each r -subset constraint removed, we deactivate at most one “partition”...

storage node 1

x1


storage node 2

...

x2

...

s

r

storage node n

xn


t

“Partitions”

1 + 2 ≥ 1 1 + 3 ≥ 1 1 + 4 ≥ 1

{1 + 2 ≥ 1, 3 + 4 ≥ 1}

{1 + 3 ≥ 1, 2 + 4 ≥ 1}

2 + 3 ≥ 1 2 + 4 ≥ 1

{1 + 4 ≥ 1, 2 + 3 ≥ 1}


ICC 2010 2010-05-26

Access to a Fixed-Size Subset of Nodes Regime of High Recovery Probability Proof Idea: Consider (n, r) = (4, 2). To deactivate all 3 “partitions”, we need to remove at least 3 r -subset constraints...

storage node 1

x1


storage node 2

...

x2

...

s

r

storage node n

xn


t

“Partitions”

1 + 2 ≥ 1 1 + 3 ≥ 1 1 + 4 ≥ 1

{1 + 2 ≥ 1, 3 + 4 ≥ 1}

{1 + 3 ≥ 1, 2 + 4 ≥ 1}

2 + 3 ≥ 1 2 + 4 ≥ 1

{1 + 4 ≥ 1, 2 + 3 ≥ 1}


ICC 2010 2010-05-26

Access to a Fixed-Size Subset of Nodes Regime of High Recovery Probability Proof Idea: Consider (n, r) = (4, 2). To deactivate all 3 “partitions”, we need to remove at least 3 r -subset constraints

storage node 1

x1

storage node 2

...

...

s

x2

r

storage node n

xn


t

In other words, we will have at least one active “partition” if fewer than 3 r -subset constraints are removed Therefore, if the required recovery 3 1 probability exceeds 1 − 6 = 2 , then we will n need T ≥ 2 = r , that is, the allocation

1 = 2 = 3 = 4 = 1r = 12 is optimal


ICC 2010 2010-05-26


storage node 1

x1

storage node 2

...

...

s

x2

r

storage node n

xn


t


1 = 2 = 3 = 4 = 1r = 12 is optimal


ICC 2010 2010-05-26


storage node 1

x1

storage node 2

...

...

s

x2

r

storage node n

xn


t


1 = 2 = 3 = 4 = 1r = 12 is optimal


ICC 2010 2010-05-26

Access to a Fixed-Size Subset of Nodes Regime of High Recovery Probability Intervals of recovery probability PS over which 1 the allocation  = r ,  = 1, . . . , n, is optimal, for n = 40 recovery probability

Ps

1.0 0.9 0.8 0.7 0.6 0.5 0.4 0.3 0.2 0.1 0

5

10

15

20

25

subset size r Distributed Storage Allocation for High Reliability

ICC 2010 2010-05-26

30

35

40

Access to a Fixed-Size Subset of Nodes Summary

storage node 1

x1

1

We are able to describe the optimal allocation in the regime of high recovery probability

2

Question: For the general problem, how much will we lose if we were to consider only symmetric allocations?

3

Question: What is the optimal symmetric allocation?

storage node 2

...

...

s

x2

r

t

storage node n

xn



ICC 2010 2010-05-26


storage node 1

x1

1


2


3


storage node 2

...

...

s

x2

r

t

storage node n

xn



ICC 2010 2010-05-26


storage node 1

x1

1


2


3


storage node 2

...

...

s

x2

r

t

storage node n

xn



ICC 2010 2010-05-26

Thank you!

Table of Contents 1

2

3

Introduction Motivating Example Key Question General Problem Description Storage Allocation Access by a Data Collector Objective Related Work Access to a Fixed-Size Subset of Nodes Problem Description Some Numerical Results Special Case of Probability-1 Recovery Regime of High Recovery Probability Summary

Distributed Storage Allocation for High Reliability

Distributed Storage Allocation for High Reliability

Suggest Documents

Distributed Storage Allocation Problems - CiteSeerX

Distributed Storage Allocation Problems - Erasure Coding for ...

Proportional Service Allocation in Distributed Storage Systems

Optimal Task Allocation for Maximizing Reliability in Distributed Real ...

On Storage Allocation for Maximum Service Rate in Distributed ... - arXiv

Reliability for Networked Storage Nodes

Efficient Allocation and Composition of Distributed Storage - CiteSeerX

Quota enforcement for high-performance distributed storage systems

Dynamic Storage Allocation Systems

TASK ALLOCATION MODEL FOR RELIABILITY AND

Reliability Allocation for Fault Tolerant Software

A Comprehensive Reliability Allocation Method for

Efficient Algorithms for Persistent Storage Allocation - CiteSeerX

Storage Allocation for Embedded Processors - CiteSeerX

storage capacity allocation algorithms for ... - Semantic Scholar

Efficient Algorithms for Persistent Storage Allocation - CiteSeerX

Reliability-Constrained Resource Allocation with

Dynamic Distributed Resource Allocation: A Distributed ... - CiteSeerX

Distributed Power Allocation for Vehicle ... - Semantic Scholar

Prioritized Threshold Allocation for Distributed Frequency

Distributed Resource Allocation Strategies for ... - Semantic Scholar

Distributed Resource Allocation for Femtocell Interference ... - CiteSeerX

Auction-Based Distributed Resource Allocation for Cooperation

Distributed Resource Allocation for Virtualized ... - Semantic Scholar