A peer-to-peer file storage and sharing system∗
Rohit Katiyar(Y6396),
[email protected]
Nand Kishor Bansal(Y6277),
[email protected]
Abstract
Napster [7]) aimed at creating cooperative file storage and sharing systems. These applications harness the idle storage and network resources of volunteer peers to create an online storage and sharing system, yet they face challenges in being reliable and scalable. In this project, we propose a decentralized and unmanaged P2P system that provides online, fault-tolerant, strongly persistent storage available to multiple users simultaneously. The system consists of nodes connected to the network, where each node is identical and performs identical jobs. Each node can send messages to other nodes and is responsible for routing messages sent by other nodes. A node contributes to the network by accepting and storing an amount of data from the network on its own drives. A node can also post data to the network and download data that other nodes have posted. The system's total storage capacity is the aggregate of the small amounts of storage that each node makes available on its local drive. Thus, in a network consisting of thousands of nodes, the available space is seemingly limitless. The nodes together form a self-organizing and decentralized overlay network. Inserted files are replicated over multiple nodes to allow peers to join and leave with high frequency without affecting the persistence of stored data. The system uses optimal erasure codes based on Reed-Solomon codes[12] instead of simply duplicating files over nodes. We shall show that erasure codes provide high file availability even when the expected availability of an individual node is as low as approximately 25% of the time. The system uses a routing scheme based on the Pastry[14] location protocol to route client requests to the appropriate nodes. Pastry can reliably route a client request in a number of hops that is at most logarithmic in the total number of nodes in the network. The peer application also includes a hierarchical directory structure, thus allowing users to browse for files.
The system adds semantics and meta-data over the data blocks shared over the network; a peer interprets this meta-data and presents a directory-like structure to the user. The directory structure is globally shared amongst all peers in a completely decentralized manner, in the sense that no central authority is needed to maintain base meta-data and indexes. The client application, or peer, is implemented for the MS
The use of peer-to-peer technologies has been prevalent in sharing files and media over the internet. This project proposes a new cooperative file sharing and storage system that allows users to store, share and find files over a robust and scalable network. The system is based on peer-to-peer technologies and provides a self-managed, decentralized, efficient, load-balanced and distributed way to store and retrieve files on peer nodes. Each node donates storage space to the system and also routes client messages based on the Pastry[14] location protocol. The system uses replication based on erasure codes to provide persistent storage more efficiently. It also provides its users with a global hierarchical directory structure to browse for files, like a file system. We present the system, its features, its architecture design and our progress in this report.
1
Introduction
Peer-to-peer (P2P) can be defined as a network in which a significant part of the network's functionality is provided by distributed peers, rather than being implemented on a centralized server or multiple servers. A peer is a participant in such a peer-to-peer network; generally, the same program runs on multiple hosts. Broadly speaking, a P2P network can be classified as either decentralized or centralized. Decentralized means that functionality is implemented equally by all or most of the peers in the network. Centralized means that the implementation also uses programs that are not peers, namely centralized servers, which run on a relatively small number of hosts. P2P networks have become very popular over the last decade, and file sharing is their most widely used application. A P2P file sharing system is a network which facilitates transmission of files between peers. Such a network allows peers to download files from other participants, and also allows them to mark a set of files from their local file system to be shared. These shared files are available to other users for download. Today, there are a number of peer-to-peer systems (Gnutella[5], BitTorrent[2], Freenet[4], and
Dr. R K Ghosh
[email protected]
∗ CS499: B.Tech project final term report.
Windows platform and has been tested to work on MS Windows versions Windows XP, Windows Vista and Windows 7.
2
lowing subsections we give brief overview of implementation:
Related Work
This project is inspired by several existing peer-to-peer systems such as Napster[7], Gnutella[5] and Freenet[4]. This section covers related work in this area:
Figure 1: System architecture, showing interaction between components of nodes
2.1
Distributed Hash Table
A Distributed Hash Table, or DHT, provides a (key, value) pair lookup service similar to a hash table. The difference between a DHT and an ordinary hash table is that in a DHT the responsibility for managing the mapping from keys to data values is distributed among multiple hosts, without any fixed hierarchy. Our project is layered on top of the Pastry[14] Distributed Hash Table (DHT) implementation for lookups of data blocks.
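To illustrate the idea of distributing a key-to-value mapping over hosts, the following is a minimal single-process sketch, not the project's implementation. The class name `ToyDHT`, the use of SHA-1, and the "numerically closest id" placement rule are assumptions made for illustration; a real DHT such as Pastry reaches the responsible node by routing messages between hosts rather than by scanning a local dictionary.

```python
import hashlib


class ToyDHT:
    """Toy DHT: a key lives on the node whose id is numerically closest to hash(key)."""

    def __init__(self, node_names):
        # One private (key -> value) store per node, indexed by the node's hashed id.
        self.nodes = {self._hash(name): {} for name in node_names}

    @staticmethod
    def _hash(text):
        # SHA-1 is an illustrative choice; any uniform hash would do here.
        return int.from_bytes(hashlib.sha1(text.encode("utf-8")).digest(), "big")

    def responsible_node(self, key):
        # Pick the node id closest to the key's hash.
        h = self._hash(key)
        return min(self.nodes, key=lambda node_id: abs(node_id - h))

    def put(self, key, value):
        self.nodes[self.responsible_node(key)][key] = value

    def get(self, key):
        return self.nodes[self.responsible_node(key)].get(key)
```

Because every participant computes the same hash, any node can work out which node is responsible for a key without consulting a central index.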
2.2
3.1
When a client starts, it must find at least one node already in the network; from then on, the client dynamically builds and maintains its routing table. Finding this first node is called bootstrapping and is implemented using the following two methods:
Overlay Network
An overlay network is a computer network that is built on top of another network. Overlay networks are constructed to permit routing of messages to destinations whose IP addresses are not known. We use a distributed routing algorithm based on the Pastry[14] overlay network.
2.3
3.1.1
Erasure Codes
3.1.2
Peer Cache
A multicast-based approach is not sufficient for wide area networks or for LANs that do not enable multicast. A peer-cache-based node discovery mechanism can also be used to find the address of the first node to join. A list of the most frequently available peers is shipped with the program and is used for finding peers in the first session. Our implementation uses both strategies: the client first does a multicast range search to find any peer in the same LAN and, if that is unsuccessful, checks addresses from the peer cache. Also, before ending each session, the client stores the addresses of frequently online peers in the peer cache, which is then used in subsequent sessions.
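The two-step bootstrapping strategy described above can be sketched as follows. This is only an outline under stated assumptions: the function names, the callables `multicast_search` and `is_reachable`, and the cache size of 20 are hypothetical and stand in for the actual network code, which is not shown in this report.

```python
from collections import Counter


def find_bootstrap_peer(multicast_search, peer_cache, is_reachable):
    """Return the address of a first peer to join, or None if none is found.

    multicast_search: callable returning a peer address on the LAN, or None (hypothetical).
    peer_cache: iterable of addresses remembered from earlier sessions.
    is_reachable: callable(address) -> bool, e.g. a connection attempt (hypothetical).
    """
    # Step 1: look for a peer in the same multicast-enabled LAN.
    peer = multicast_search()
    if peer is not None:
        return peer
    # Step 2: fall back to the addresses stored in the peer cache.
    for address in peer_cache:
        if is_reachable(address):
            return address
    return None


def update_peer_cache(seen_counts, max_entries=20):
    """Keep the most frequently seen peers for use in subsequent sessions."""
    return [address for address, _ in Counter(seen_counts).most_common(max_entries)]
```

Passing the network operations in as callables keeps the fallback order easy to test without a live network.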
Shared File Systems
A number of P2P applications provide shared file systems which are distributed and fault-tolerant. These file systems are used in both high performance computing and online storage systems. Cooperative File System[10], PAST[11] and Wuala[9] are online distributed file systems that focus on scalability. Freenet[4] is also an online storage and sharing system, but it focuses more on providing anonymity.
3
Multicast Search
In the multicast-based approach, all peers in a multicast-enabled LAN join a common multicast group. On start-up, a client sends a query to this multicast group to get the IP addresses of one or more other participants, and then connects to them.
Erasure codes are a special type of forward error correction (FEC) code that has been used in RAID disk systems to minimize the replication ratio while still achieving highly persistent storage. HDFS[6], a distributed file system that is part of Apache Hadoop[1], also uses erasure codes for replication.
2.4
Bootstrapping
3.2
Pastry
We briefly describe the Pastry[14] location and routing protocol in this subsection. Pastry is an overlay network that performs routing based on a Distributed Hash Table (DHT). Each node is assigned a unique 128-bit identifier, its nodeId, generated by hashing the node's address. Each file that is inserted into the network is assigned a
System Design
The core system architecture consists mainly of a GUI, Pastry[14] location scheme, an erasure encoderdecoder, and a global directory structure. In the fol-
160-bit fileId, generated by hashing together the file name and the owner's nodeId. For a given fileId, Pastry routes a message to the live node whose nodeId is numerically closest to the most significant 128 bits of the fileId. Pastry is bootstrapped by first giving it the IP address of a node already on the overlay network. From then on, the routing table is dynamically built and maintained. Pastry can locate the closest node in fewer than $\lceil \log_{2^b} n \rceil$ steps on average, where $n$ is the number of nodes in the network and $b$ and $l$ are Pastry configuration parameters. The routing table required in each Pastry node has only $(2^b - 1) \cdot \lceil \log_{2^b} n \rceil + 2l$ entries. After a network failure, the invariants in all nodes can be restored by exchanging $O(\log_{2^b} n)$ messages. In addition to the routing table, a Pastry node stores the IP addresses of its neighbours in a leaf set. Since fileIds are distributed randomly over the nodeId space, Pastry approximately balances the load stored at each node.
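As a rough illustration of the identifiers and bounds described for Pastry, the sketch below derives ids and evaluates the hop and table-size formulas. It is not the project's code: the use of SHA-1 (truncated to 128 bits for nodeIds), the function names, and the sample values b = 4 and l = 16 are assumptions. The lookup scans a known list of live nodes instead of routing hop by hop as Pastry does.

```python
import hashlib


def node_id(address):
    """128-bit nodeId from the node's address (top 128 bits of SHA-1; hash choice assumed)."""
    digest = hashlib.sha1(address.encode("utf-8")).digest()
    return int.from_bytes(digest[:16], "big")


def file_id(file_name, owner_node_id):
    """160-bit fileId from the file name hashed together with the owner's nodeId."""
    data = file_name.encode("utf-8") + owner_node_id.to_bytes(16, "big")
    return int.from_bytes(hashlib.sha1(data).digest(), "big")


def responsible_node(fid, live_node_ids):
    """Live node numerically closest to the most significant 128 bits of the fileId."""
    key = fid >> 32  # drop the low 32 bits of the 160-bit fileId
    return min(live_node_ids, key=lambda nid: abs(nid - key))


def max_hops(n, b=4):
    """ceil(log_{2^b} n), computed with integers to avoid floating-point rounding."""
    hops, reach = 0, 1
    while reach < n:
        reach *= 2 ** b
        hops += 1
    return hops


def routing_table_entries(n, b=4, l=16):
    """(2^b - 1) * ceil(log_{2^b} n) + 2l, the table size quoted in the text."""
    return (2 ** b - 1) * max_hops(n, b) + 2 * l
```

With the assumed b = 4, a network of one million nodes needs at most 5 hops, since 16^5 is the first power of 16 that exceeds 10^6.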
3.3
code blocks are available. File availability:
$$P_{av} = \sum_{i=k}^{m} \binom{m}{i} p^{i} (1 - p)^{m-i}$$
For $p = 0.25$, $m = 100$, $k = 20$, the redundancy factor is $r = m/k = 5.0$ and $P_{av} = 0.90$.
For $p = 0.25$, $m = 500$, $k = 100$, the redundancy factor is $r = m/k = 5.0$ and $P_{av} = 0.99$.
These two examples show that erasure codes provide higher file availability than replication at the same redundancy factor, even if individual nodes are available only 25% of the time.
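The availability figures for both schemes can be checked numerically. The sketch below evaluates the replication formula $1 - (1 - p)^r$ and the binomial tail sum for erasure codes, assuming nodes are available independently with probability p. The function names are ours and are not part of the system.

```python
from math import comb


def replication_availability(p, r):
    """Probability that at least one of r full replicas is available."""
    return 1.0 - (1.0 - p) ** r


def erasure_availability(p, m, k):
    """Probability that at least k of the m code blocks are available."""
    return sum(comb(m, i) * p ** i * (1.0 - p) ** (m - i) for i in range(k, m + 1))


if __name__ == "__main__":
    # Same node availability (p = 0.25) and redundancy factor (5) in all three cases.
    print("replication r=5:      %.2f" % replication_availability(0.25, 5))
    print("erasure m=100, k=20:  %.2f" % erasure_availability(0.25, 100, 20))
    print("erasure m=500, k=100: %.2f" % erasure_availability(0.25, 500, 100))
```

Setting k = 1 reduces the erasure-code formula to the replication formula, which is a quick sanity check that the two expressions are consistent.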
3.4
Global Directory Structure
All peers in the network share a global hierarchical structure for organizing files. This directory structure organizes file nodes under directory nodes, which creates a tree-like structure, much like a file system. Every directory node or file node that is created or changed by any peer is updated in the directory structure of all peers. However, peers that were offline during the change do not receive the update. Each such peer starts with its old directory structure, which is initially marked stale. When the user tries to view a particular directory node in the stale directory structure, the peer sends out an update request, which includes the last update time of that directory. All other peers that have later updates reply to this request, and each reply again includes the time of the update. The original sender picks the latest of these updates and makes the appropriate changes in its directory structure.
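The update-request step can be illustrated with a small sketch of how a peer might choose among the replies it receives. The `DirectoryUpdate` record, its fields, and the function name are hypothetical; the report does not specify the actual message format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DirectoryUpdate:
    """One reply to an update request (hypothetical message layout)."""
    path: str          # directory node the update refers to
    updated_at: float  # time of the update, e.g. seconds since the epoch
    entries: tuple     # names of the file and directory nodes under `path`


def pick_latest_update(local_updated_at, replies):
    """Return the most recent reply that is newer than the local copy, else None."""
    newer = [reply for reply in replies if reply.updated_at > local_updated_at]
    if not newer:
        return None  # the local copy is already up to date
    return max(newer, key=lambda reply: reply.updated_at)
```

A peer would apply the returned update to its stale directory node and then clear the stale mark; if `None` is returned, the local copy can simply be marked fresh.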
Optimal erasure codes
To ensure high availability of files, data blocks are replicated over multiple nodes. However, simple replication techniques are inefficient because of their high storage overhead. We use optimal erasure codes[13] from the Reed-Solomon code[12] family. An erasure code is a forward error correction (FEC) code which, given a message of k blocks, recodes it into m code blocks, where m > k. Optimal erasure codes have the property that any k out of the m code blocks are sufficient to reconstruct the original message. A file is erasure encoded into m code blocks, giving a redundancy factor of m/k, and these m code blocks are distributed to different hosts on the network using the Pastry protocol. As long as any k out of these m blocks are available, the original file can be reconstructed.
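The "any k out of m" property can be demonstrated with a small Reed-Solomon-style code built from polynomial interpolation. This is a teaching sketch and not the erasure coding library used in the project: it works on single symbols over the prime field GF(257) for simplicity (an assumption; practical implementations typically use GF(2^8) and operate on whole blocks), and the function names are ours.

```python
P = 257  # prime modulus; every byte value 0..255 is a valid field element


def _interpolate_at(points, x):
    """Evaluate at x the unique polynomial of degree < len(points) through `points` (mod P)."""
    total = 0
    for j, (xj, yj) in enumerate(points):
        num, den = 1, 1
        for i, (xi, _) in enumerate(points):
            if i != j:
                num = num * (x - xi) % P
                den = den * (xj - xi) % P
        # pow(den, -1, P) is the modular inverse (Python 3.8+).
        total = (total + yj * num * pow(den, -1, P)) % P
    return total


def encode(data, m):
    """Encode k data symbols into m code blocks (index, value); the first k are the data."""
    k = len(data)
    assert 0 < k <= m < P
    base = list(enumerate(data))  # the polynomial passes through (i, data[i])
    return [(x, _interpolate_at(base, x)) for x in range(m)]


def decode(blocks, k):
    """Rebuild the k data symbols from any k distinct code blocks."""
    assert len(blocks) >= k
    points = blocks[:k]
    return [_interpolate_at(points, x) for x in range(k)]
```

For example, encoding five symbols into ten blocks gives a redundancy factor of 2, and any five of the ten blocks passed to `decode` return the original five symbols.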
4
Persistence: The following calculations show that erasure coding gives a higher file availability probability than replication, given the same redundancy factor and node availability probability:
Conclusion
In our first term proposal, we promised to build a P2P system where both control and data are decentralized. We also promised to include file system integration, which would allow applications to directly access files stored in the network. If this were implemented and working, music and video files could be played before they have been completely downloaded. We intended to use the Dokan[3] library to implement a virtual file system on the MS Windows platform. However, we could not achieve the desired results with the Dokan library. Instead, we have created a client GUI that provides a file browser; the user needs to first download and copy the files to a directory on the local file system and then open them from there. During the first term we had finished the implementation of an overlay network based on the Pastry[14] routing protocol, and a library for erasure coding and decoding. Our work during the second term involved designing
(a) Replication: In this scheme, a file is replicated in its entirety on multiple hosts. The file is available if any one of the $r$ replicas is available. File availability: $P_{av} = 1 - (1 - p)^r$. For node availability $p = 0.25$ and redundancy factor $r = 5$, $P_{av} = 0.76$.
(b) Erasure codes: In this scheme, a file is broken into $k$ blocks, which are recoded into $m$ code blocks. This gives an effective redundancy factor of $m/k$. A file can be reconstructed as long as any $k$ out of $m$
and implementing a global directory structure to manage files; creating a GUI to ease file browsing, uploading (sharing) and downloading; and integrating these components to build the final peer software. In this project we identified a number of issues with P2P networks. After considering a number of alternative implementations, we designed and built a file sharing system that tries to solve those issues. The major contribution of this project is to build a file storage and sharing system from a file distribution system by adding meta-data to the shared files. We also achieved a completely decentralized model for sharing files.
References
[1] Apache Hadoop, open-source software for reliable, scalable, distributed computing, April 2010. http://hadoop.apache.org/.
[2] BitTorrent website, April 2010. http://www.bittorrent.com.
[3] Dokan: user mode file system for Windows, April 2010. http://dokan-dev.net/en/.
[4] Freenet, a distributed anonymous information storage and retrieval system, April 2010. http://freenetproject.org/.
[5] Gnutella, a file sharing network, April 2010. http://en.wikipedia.org/wiki/Gnutella.
[6] Hadoop distributed file system (HDFS), April 2010. http://hadoop.apache.org/hdfs/.
[7] Napster, an online music file sharing service, April 2010. http://www.napster.com.
[8] Overlay network. From Wikipedia, the free encyclopedia, April 2010. http://en.wikipedia.org/wiki/Overlay_network.
[9] Wuala, secure online storage - backup. store. share. access everywhere, April 2010. http://www.wuala.com/.
[10] Frank Dabek, M. Frans Kaashoek, David Karger, Robert Morris, and Ion Stoica. Wide-area cooperative storage with CFS. In Proceedings of the 18th ACM Symposium on Operating Systems Principles (SOSP '01), Chateau Lake Louise, Banff, Canada, October 2001.
[11] P. Druschel and A. Rowstron. PAST: A large-scale, persistent peer-to-peer storage utility. In HotOS VIII, Schloss Elmau, Germany, May 2001.
[12] M. S. Manasse, C. A. Thekkath, and A. Silverberg. A Reed-Solomon code for disk storage, and efficient recovery computations for erasure-coded disk storage. Available at: http://research.microsoft.com/en-us/projects/kohinoor/wdas.pdf.
[13] Luigi Rizzo. Effective erasure codes for reliable computer communication protocols. Computer Communication Review, April 1997.
[14] A. Rowstron and P. Druschel. Pastry: Scalable, decentralized object location and routing for large-scale peer-to-peer systems. In IFIP/ACM International Conference on Distributed Systems Platforms (Middleware), pages 329–350, Heidelberg, Germany, November 2001.