A Highly Available Cluster Storage System Using Scavenging
High Availability and Performance Computing Workshop (HAPCW'04), held with 2004 Los Alamos Computer Science Institute Symposium (LACSI’2004)
Xubin (Ben) He, Li Ou
{hexb, lou21}@tntech.edu
Storage Technology & Architecture Research (STAR) Lab, Electrical and Computer Engineering Department
Stephen L. Scott, Christian Engelmann
{scottsl, engelmannc}@ornl.gov
Computer Science and Mathematics Division
Outline
• Introduction
• Highly Available Metadata Management
• Metadata Management using Bloom Filters
• Conclusions and Future Work
A Highly Available Cluster Storage System Using Scavenging, HAPCW'2004
Motivations
• Data-intensive scientific applications → high performance computing.
• Distributed storage systems: high performance storage, wide area mass storage, cluster storage.
• Excellent performance, parallel support ↔ administration costs, central points of failure and control.
Existing work:
• HPSS
• Peer-to-Peer Storage
• Scavenging [Vazhkudai]
[Figure: scavenging architecture, with labels Grid Data Access, Grid Storage Management, Scavenger Registration, Scavenged Storage Morsels]
Highly Available Metadata Management
User Data vs. Metadata
[Figure: system components, with labels Metadata manager, Storage server, Client]
Distributed Metadata Management
[Figure: layered architecture, with labels Client Access, Cluster Storage Management, Scavenge Manager Unified Interface, Metadata Management Layer, Scavenged Storage Morsels]
High Availability: Scheme #1 (Pure P2P)
[Figure: with labels Peer-To-Peer Network, Metadata Manager]
High Availability: Scheme #2 (active/hot-standby)
[Figure: with labels Peer-To-Peer Network, Hot-Standby Manager, Active Manager, Metadata]
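The slide for Scheme #2 names only the two roles, so the following is a minimal sketch of how an active/hot-standby pair might behave, not the authors' implementation. The class names `MetadataManager` and `ActiveStandbyPair`, the `alive` flag standing in for failure detection, and the synchronous mirroring of every update are all assumptions made for illustration.

```python
class MetadataManager:
    """Hypothetical metadata manager holding a file-name -> location table."""

    def __init__(self, name):
        self.name = name
        self.alive = True   # stands in for a real failure detector
        self.table = {}     # file name -> location of its data


class ActiveStandbyPair:
    """One active manager serves requests; a hot standby mirrors its state."""

    def __init__(self, active, standby):
        self.active = active
        self.standby = standby

    def update(self, file_name, location):
        self._failover_if_needed()
        self.active.table[file_name] = location
        # Mirror every update so the standby can take over without recovery.
        if self.standby is not None and self.standby.alive:
            self.standby.table[file_name] = location

    def lookup(self, file_name):
        self._failover_if_needed()
        return self.active.table.get(file_name)

    def _failover_if_needed(self):
        if self.active.alive:
            return
        if self.standby is None or not self.standby.alive:
            raise RuntimeError("no metadata manager available")
        # Promote the hot standby; a real system would also start a new standby.
        self.active, self.standby = self.standby, None
```

In this sketch a lookup issued after the active manager fails is answered by the promoted standby, because every earlier update was mirrored to it.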
High Availability: Scheme #3 (Partitioning)
[Figure: with labels Master/Load Redirector, Network, Metadata Manager]
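The slide for Scheme #3 does not say how the metadata is partitioned, so the sketch below assumes one simple possibility: the master/load redirector hashes each file path and forwards the request to the metadata manager that owns the resulting partition. The class name `LoadRedirector`, the method `manager_for`, and the choice of hash are hypothetical.

```python
import hashlib


class LoadRedirector:
    """Maps each file path to one metadata manager partition (assumed scheme)."""

    def __init__(self, manager_ids):
        if not manager_ids:
            raise ValueError("need at least one metadata manager")
        self.manager_ids = list(manager_ids)

    def manager_for(self, path: str) -> str:
        # A stable hash keeps the mapping identical across processes and
        # restarts (Python's built-in hash() is randomized for strings).
        digest = hashlib.sha256(path.encode("utf-8")).digest()
        index = int.from_bytes(digest[:4], "big") % len(self.manager_ids)
        return self.manager_ids[index]
```

Plain modulo hashing remaps most paths when the number of managers changes; a scheme such as consistent hashing would reduce that movement, at the cost of a more involved redirector.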
High Availability: Scheme #4 (leader/helper)
[Figure: with labels Network, Metadata, Helper, Leader]
Group Communications
• Peer-to-Peer distributed control
• Reliable broadcast, Atomic Broadcast
• Atomic transactions guarantee metadata integrity
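To illustrate why atomic (totally ordered) broadcast helps keep replicated metadata consistent, here is a minimal sketch assuming a fixed-sequencer design, which is only one of several ways to build atomic broadcast and is not stated in the slides. The names `Sequencer` and `Replica` are hypothetical, and the sketch covers ordering only: it does not model message loss or sequencer failure.

```python
class Sequencer:
    """Assigns a global sequence number to every metadata update."""

    def __init__(self):
        self.counter = 0

    def stamp(self, update):
        seq = self.counter
        self.counter += 1
        return seq, update


class Replica:
    """Metadata manager replica that applies updates in sequence order."""

    def __init__(self):
        self.next_seq = 0
        self.pending = {}    # seq -> update that arrived too early
        self.metadata = {}   # file name -> location

    def receive(self, seq, update):
        self.pending[seq] = update
        # Deliver strictly in sequence order, holding back early arrivals.
        while self.next_seq in self.pending:
            key, value = self.pending.pop(self.next_seq)
            self.metadata[key] = value
            self.next_seq += 1
```

Two replicas that receive the same stamped updates in different network orders end with identical metadata tables, because each one applies them in sequence order.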
Metadata Management using Bloom Filters
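Bloom Filter
• A Bloom filter is a fast and efficient method for representing a set A = {a1, a2, …, an} of n elements to support membership queries.
[Figure: an element a is hashed by four functions, h1(a) = P1, h2(a) = P2, h3(a) = P3, h4(a) = P4, and the bits at positions P1 to P4 of an m-bit vector are set to 1]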
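The figure's idea, k hash functions each setting one of m bits, can be sketched as below. This is an illustrative implementation, not the one used in the presented system: the class name `BloomFilter`, the default sizes, and deriving the k hash functions by salting SHA-256 with the function index are assumptions. The helper `false_positive_rate` uses the standard approximation for a filter holding n elements.

```python
import hashlib
import math


class BloomFilter:
    """Bit vector of m bits with k hash functions (the slide shows k = 4)."""

    def __init__(self, m: int = 1024, k: int = 4):
        self.m = m
        self.k = k
        self.bits = bytearray(m)  # one byte per bit, for readability

    def _positions(self, item: str):
        # Derive positions P1..Pk by salting one hash with the function index.
        for i in range(self.k):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.m

    def add(self, item: str) -> None:
        for p in self._positions(item):
            self.bits[p] = 1

    def __contains__(self, item: str) -> bool:
        # False means "definitely absent"; True means "possibly present".
        return all(self.bits[p] for p in self._positions(item))


def false_positive_rate(m: int, k: int, n: int) -> float:
    """Standard estimate (1 - e^(-k*n/m))^k after inserting n elements."""
    return (1.0 - math.exp(-k * n / m)) ** k
```

A membership query never returns a false negative, so a "not present" answer can be trusted; a "present" answer still has to be confirmed against the real metadata.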
Metadata Management using Bloom Filters
[Figure: query flow, with labels File queries, Scavenge Manager Unified Interface, Hashing and LRU Cache (hit, miss), Bloom Filter Array (hit, miss), Update, Multicast the queries (on a miss)]
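One plausible reading of the query flow in the figure is: check an LRU cache first, consult an array of Bloom filters (one per metadata manager) on a cache miss, and multicast the query to all managers only when the filters also miss. The sketch below follows that reading. The class `MetadataLocator`, its method names, the in-memory dictionary standing in for remote managers, and the loop standing in for a multicast are all assumptions for illustration.

```python
import hashlib
from collections import OrderedDict


class BloomFilter:
    def __init__(self, m=4096, k=4):
        self.m, self.k, self.bits = m, k, bytearray(m)

    def _positions(self, name):
        for i in range(self.k):
            d = hashlib.sha256(f"{i}:{name}".encode("utf-8")).digest()
            yield int.from_bytes(d[:8], "big") % self.m

    def add(self, name):
        for p in self._positions(name):
            self.bits[p] = 1

    def __contains__(self, name):
        return all(self.bits[p] for p in self._positions(name))


class MetadataLocator:
    """LRU cache first, then one Bloom filter per manager, then multicast."""

    def __init__(self, managers, cache_size=128):
        # managers: dict of manager id -> set of file names it really holds
        self.managers = managers
        self.cache = OrderedDict()
        self.cache_size = cache_size
        self.filters = {}
        for mid, files in managers.items():
            bf = BloomFilter()
            for f in files:
                bf.add(f)
            self.filters[mid] = bf

    def _remember(self, name, mid):
        self.cache[name] = mid
        self.cache.move_to_end(name)
        if len(self.cache) > self.cache_size:
            self.cache.popitem(last=False)  # evict least recently used

    def locate(self, name):
        """Return (manager id or None, name of the stage that answered)."""
        if name in self.cache:
            self.cache.move_to_end(name)
            return self.cache[name], "cache"
        candidates = [mid for mid, bf in self.filters.items() if name in bf]
        # A filter hit may be a false positive, so confirm with the manager.
        for mid in candidates:
            if name in self.managers[mid]:
                self._remember(name, mid)
                return mid, "bloom"
        # Filter-array miss: ask every manager (stands in for a multicast).
        for mid, files in self.managers.items():
            if name in files:
                self.filters[mid].add(name)  # update the stale filter
                self._remember(name, mid)
                return mid, "multicast"
        return None, "multicast"
```

The ordering goes from cheapest to most expensive: a cache hit costs one dictionary lookup, a filter-array hit costs a few hash computations per manager, and only a miss at both levels generates network traffic to every manager.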
Conclusions and Future Work
Conclusions
• Investigated availability issues in scavenged storage systems and proposed 4 solutions for maintaining multiple metadata managers:
  – P2P
  – Active/hot-standby
  – Partitioning
  – Leader/Helper
• Speed up the metadata searching:
  – Bloom Filters
Future Work
• Comparing the proposed 4 schemes
• Scalability
• Metadata Cache
Acknowledgements
• Research Office and Center for Manufacturing Research, Tennessee Technological University.
• Ralph E. Powe Junior Faculty Enhancement Award by Oak Ridge Associated Universities (ORAU).
• Mathematics, Information and Computational Sciences Office, Office of Advanced Scientific Computing Research, Office of Science, U.S. Department of Energy.
Questions and Comments?