2013 International Computer Science and Engineering Conference (ICSEC): ICSEC 2013 English Track Full Papers

Design and Implementation of BitTorrent File System for Distributed Animation Rendering

Namfon Assawamekin
University of the Thai Chamber of Commerce, Bangkok 10400, Thailand
Phone: +66(0) 2697-6506, Fax: +66(0) 2277-7007
[email protected], [email protected]

Ekasit Kijsipongse
National Electronics and Computer Technology Center, Pathumthani 12120, Thailand
Phone: +66(0) 2564-6900, Fax: +66(0) 2564-6901
[email protected]
Abstract—Rendering is a crucial process in the production of computer-generated animation movies. It executes computer programs to produce a series of images which will be sequenced into a movie. However, the rendering process on a single machine can be tedious, time-consuming and unproductive, especially for 3D animation. To resolve these problems, animation rendering is commonly carried out in a distributed computing environment where the rendering is distributed to a number of computers. Efficient handling of such distributed animation rendering has thus become a major research challenge in increasing performance and reducing cost for the animation industry. In this paper, we describe the design and implementation of the BitTorrent File System (BTFS), which is used to improve the communication performance of distributed animation rendering. The BTFS efficiently and transparently provides data sharing among distributed computers in a peer-to-peer manner to reduce the data access time for any rendering software. The experimental results and evaluation are presented as a proof-of-concept of our architecture. Some interesting future enhancements are also discussed.
Keywords—animation rendering; BitTorrent; distributed file system; peer-to-peer

I. INTRODUCTION
Animation rendering is a process that transforms 3D models into hundreds of thousands of image frames to be composed into a movie. The rendering process is very compute-intensive and time-consuming: a single frame of an industrial-level animation can take several hours to render on a single machine. Animation rendering is therefore typically carried out on a set of high-performance computers, where distributed rendering takes place: frames are distributed and rendered across many machines in a network to reduce the overall rendering time. Distributed rendering comes in many flavors, such as render farms and volunteer-based rendering. In a render farm, machines are dedicated to the rendering task and all machines are tightly coupled in a local area network with high bandwidth. In volunteer-based rendering [1], machines are loosely connected by the public Internet and the machine owners provide the idle time of their computing resources for rendering. For example, Renderfarm.fi [2] is a large-scale, volunteer-based and loosely-coupled rendering service that distributes the rendering process over the Internet.

Originally, volunteer computing used central servers for distributing data and computation to clients [3]; there was no notion of data exchange between clients. However, in animation rendering, the 3D models and related library files are large. These input files have to be transferred to the clients before the rendering can begin, and the transfer time is significant due to the latency of the public Internet. Additionally, centralized servers can become overloaded when too many clients request data, which slows down the rendering process. Since the same files may be used by several clients at almost the same time, there is a great opportunity to coordinate the file transfer among clients in the peer-to-peer (P2P) manner to reduce the data transfer time. With the peer-to-peer model, a client that has already downloaded the whole file, or some parts of it, from the central servers can share the file with other clients, so that they can download the file (or parts of it) directly from that client instead of from the central servers. As more copies of the files appear on different clients, fewer downloads from the central servers are required.

To allow rendering software to transparently access the shared files over the P2P model without having the applications modified, it is necessary to implement the P2P file sharing service as a file system layer in the operating system. Transferring the files from several peers across the network is then invisible to the applications, so the shared files can be treated the same as files on local disks.
This paper presents the design and implementation of the BitTorrent File System (BTFS), which efficiently and transparently provides data sharing among distributed computers in a peer-to-peer manner to improve the communication performance of distributed rendering. The rest of the paper is organized as follows. Section 2 gives a brief overview of related work. Section 3 describes the architecture as well as the implementation of the BTFS. Section 4 presents the experimental setup and results. Finally, we draw conclusions and discuss the ongoing work in Section 5.
II. RELATED WORK
BitTorrent [4] is a P2P file sharing protocol that has been utilized in many applications to improve the performance of data transfer. For example, Kaplan et al. [5] proposed GridTorrent, which makes use of BitTorrent to efficiently distribute data for scientific applications in a Grid computing environment. Around the same time, Zissimos et al. [6] independently created another GridTorrent project that integrates BitTorrent with Globus Grid middleware components. They replaced the ".torrent" metainfo file with the Replica Location Service (RLS) in Globus to simplify the bootstrapping step of the native BitTorrent protocol.
For volunteer-based computing, Costa et al. [7] applied BitTorrent to optimize data distribution in BOINC [8], the open infrastructure for volunteer computing. They showed that BitTorrent can significantly reduce the network load at the servers while having minimal impact on the computation time at the clients. Wei et al. [9, 10] implemented BitTorrent in computational desktop grid platforms, including XtremWeb [11], to distribute data collaboratively among users solving scientific problems. They showed that even though the BitTorrent protocol has more overhead than typical file transfer protocols, it outperforms them when distributing large files to a high number of nodes.

Although the aforementioned works are similar to ours, several points remain distinct. Firstly, our implementation uses a hierarchical Unix-like file namespace, while some others use a flat namespace. Secondly, most of them have no cache management at the client side for data that are repeatedly used. Thirdly, all of them provide specific APIs which require software modification, and thus they are not transparent to the applications running at the upper layer. Lastly, their evaluations were done with scientific or even synthetic applications. Our work also differs from cloud-based storage services like Dropbox [12] in that they synchronize the entire file system into the client's local storage, which is less preferable if only a small number of files are used in rendering.

III. DESIGN AND IMPLEMENTATION

The distributed rendering environment that we focus on consists of a group of central servers and a large number of distributed render clients, which may be located in local or wide area networks. Users submit rendering jobs to the servers, and a job scheduler dispatches the jobs to render on the clients. Each job specifies the rendering program with all necessary arguments, including the names of the input and output files. All input files necessary for a client to render a job are initially available on the servers. The input files must eventually be transferred from the servers or peers to the client on which the job is executed. The output files are later transferred back from the clients to the servers.

In animation rendering, many jobs running on different clients require the same input data. When a client needs to download the input data for a particular job, the same data may already exist on other clients. The client can then coordinate with other clients to download the input data from them in the peer-to-peer (P2P) manner instead of from the central file servers. We use the BitTorrent protocol as the means to disseminate the data among clients.
We design the BitTorrent File System (BTFS) to function at the file system layer of the Linux operating system, providing any application with transparent access to files shared with peers. The implementation of BTFS is based on Filesystem in Userspace (FUSE) [13], as illustrated in Figure 1. The system consists of four main components: the metadata server, the seeder, the tracker and the BTFS client, which are described below.
Figure 1. BitTorrent file system architecture.
A. Metadata Server
The metadata server provides information about the file and directory structure of the BTFS namespace, such as the file size or last modification time. All files have to be registered with the metadata server and unregistered when they are no longer needed. Registration is possible only if the path is not already occupied by another file in the namespace. The metadata server is also responsible for enforcing file permissions and for serving clients the torrent information of the requested files. Each file in BTFS is associated with an individual torrent. Following the BitTorrent protocol [4], the torrent information includes the filename, file size, hash information, number of pieces, and seeder and tracker URLs, like any other .torrent file used in P2P file sharing. In the current work, there is a single metadata server for each namespace.
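To make the torrent information concrete, the following sketch assembles the per-file metainfo fields listed above. It is our illustration, not code from the paper: the bencoding and SHA-1 piece hashing follow the BitTorrent spec [4], while the 256 KB piece size and the tracker URL are placeholder assumptions.

```python
# Illustrative sketch (not BTFS source): building the bencoded metainfo
# for one input file, per the BitTorrent spec [4]. The tracker URL and
# piece size below are placeholder assumptions.
import hashlib

def bencode(obj):
    """Minimal bencoder for ints, strings/bytes, lists and dicts."""
    if isinstance(obj, int):
        return b"i%de" % obj
    if isinstance(obj, str):
        obj = obj.encode()
    if isinstance(obj, bytes):
        return b"%d:%s" % (len(obj), obj)
    if isinstance(obj, list):
        return b"l" + b"".join(bencode(x) for x in obj) + b"e"
    if isinstance(obj, dict):
        out = b"d"
        for k in sorted(obj):            # the spec requires sorted keys
            out += bencode(k) + bencode(obj[k])
        return out + b"e"
    raise TypeError(type(obj))

def make_torrent(path, tracker_url, piece_len=256 * 1024):
    """Hash the file piece by piece and assemble the metainfo dictionary."""
    pieces, size = b"", 0
    with open(path, "rb") as f:
        while chunk := f.read(piece_len):
            size += len(chunk)
            pieces += hashlib.sha1(chunk).digest()   # 20 bytes per piece
    return bencode({
        "announce": tracker_url,                     # tracker URL
        "info": {"name": path.rsplit("/", 1)[-1],    # filename
                 "length": size,                     # file size in bytes
                 "piece length": piece_len,
                 "pieces": pieces},                  # concatenated hashes
    })

metainfo = make_torrent("a.blend", "http://tracker.example.org:6969/announce")
```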
We use Apache ZooKeeper 3.4.3 [14] to implement the metadata server. ZooKeeper is a distributed coordination service for distributed systems. It organizes data into a hierarchy of nodes similar to files and directories. The top-level directories in ZooKeeper consist of the /btfs and /config nodes. The /btfs node holds the root directory of a BTFS namespace; it is where the user files and directories are placed. BTFS users will only see the files and directories under the /btfs node, which appear as normal POSIX files on their computers. The /config node stores the global system configuration for BTFS clients. We manage the client configuration through ZooKeeper so that configuration updates can be distributed to all clients easily.
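As a sketch of how registration might look from a client, the snippet below stores a file's attributes in a znode under /btfs and attaches the torrent metainfo as a child node. The node layout follows the paper; the kazoo client library, the JSON attribute encoding and the register_file() helper are our assumptions.

```python
# Illustrative sketch (not BTFS source): registering a new file with the
# ZooKeeper-based metadata server via the kazoo client. The /btfs layout
# follows the paper; the JSON encoding of attributes is assumed.
import json
import time
import uuid

from kazoo.client import KazooClient
from kazoo.exceptions import NodeExistsError

def register_file(zk, btfs_path, size, torrent_bytes):
    """Create /btfs/<path> holding the file attributes, plus a torrent child."""
    attrs = {
        "uuid": str(uuid.uuid4()),  # Universally Unique IDentifier
        "mtime": int(time.time()),  # last modification time
        "size": size,               # file size in bytes
    }
    node = "/btfs" + btfs_path
    try:
        # Registration fails if the path is already occupied by another file.
        zk.create(node, json.dumps(attrs).encode(), makepath=True)
    except NodeExistsError:
        raise FileExistsError(btfs_path)
    # The torrent metainfo is stored in a child node of the user file.
    zk.create(node + "/torrent", torrent_bytes)
    return attrs["uuid"]

zk = KazooClient(hosts="192.168.1.1:2181")  # metadata server address
zk.start()
register_file(zk, "/Mary/model/a.blend", 90 * 2**20, b"<bencoded metainfo>")
```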
Each file and directory in the BTFS namespace is associated with a descendant node under the /btfs node. We store the BTFS file (or directory) attributes in a ZooKeeper node's data. The essential BTFS attributes include the Universally Unique IDentifier (UUID), the last modification time and the file size. These attributes are mapped into the appropriate fields of the POSIX stat structure. Figure 2 shows the ZooKeeper hierarchical structure and the corresponding BTFS file system. All files have different UUIDs. A torrent node in the ZooKeeper tree is created as a child of each user file; it stores the torrent information of the file to be shared in BTFS.

Figure 2. Mapping from ZooKeeper tree to BitTorrent file system: (a) ZooKeeper tree structure; (b) BTFS file system.
B. Seeder
Every file in BTFS must be uploaded to the central file server for seeding. This file server is called a seeder and is responsible for initially serving files to BTFS clients. The seeder should be located in a public network so that clients can always reach it. In a more complex deployment, there can be multiple (and possibly remotely distributed) seeders to serve files concurrently, or a single file can be replicated on multiple seeders for parallel download if necessary.

The current implementation uses a WebDAV server as the seeder, to which BTFS clients upload new files. Since every file is uniquely identified by its UUID, which is unlikely to collide, there is no need to maintain a directory structure on the seeder. Thus, for simplicity, all files are stored in the same directory on the seeder.
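Because WebDAV servers accept plain HTTP PUT requests, the upload step can be sketched in a few lines. This is our illustration under assumed names: the seeder URL is a placeholder and upload_to_seeder() is a hypothetical helper, not a BTFS API.

```python
# Illustrative sketch (not BTFS source): uploading a newly written file to
# the WebDAV seeder with a plain HTTP PUT, using the requests library.
import requests

def upload_to_seeder(seeder_url, file_uuid, local_path):
    """PUT the file body under its UUID; WebDAV servers accept plain PUT."""
    with open(local_path, "rb") as f:
        # All files share one directory on the seeder, keyed by UUID.
        resp = requests.put(f"{seeder_url}/{file_uuid}", data=f)
    resp.raise_for_status()

upload_to_seeder("http://seeder.example.org/btfs",
                 "1f3870be-274f-4f87-b11a-70d690a5f0c4", "a.blend")
```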
C. Tracker
According to the BitTorrent protocol, trackers are required to coordinate all BitTorrent peers sharing files. This is also the case for BTFS clients, which run the same protocol. Any common BitTorrent tracker can be used.

D. BTFS Client
The BTFS client is the core component that glues the other components together. It runs on the client machines where the rendering software accesses the shared files. BTFS clients intercept all file system calls, such as open() and read(), from any application that requests access to files in the BTFS file system. To read a file, a BTFS client contacts the metadata server for the attributes and torrent information of the file, and creates a BitTorrent client thread to download the file from seeders and peers. The downloaded file is temporarily cached in the local storage of the client and then passed to the requesting application. The file is kept as long as the temporary space is available; otherwise, the cache replacement algorithm is invoked to clear unused files. We use the Least Recently Used (LRU) policy for cache replacement. Future requests for the file are redirected to the cached copy, if available, to reduce network traffic. The cached file can also be exchanged with other BTFS clients through P2P file sharing.
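The read path and the LRU cache can be pictured with the following skeleton. BTFS itself is built on FUSE [13], but this sketch is ours: it uses the fusepy Python binding, omits getattr() and the other handlers, and lookup_torrent() and download_with_bittorrent() are hypothetical stubs standing in for the metadata lookup and the BitTorrent download thread.

```python
# Illustrative skeleton (not BTFS source) of the client read path on the
# fusepy binding; the two module-level helpers are hypothetical stubs.
import os
from collections import OrderedDict

from fuse import FUSE, Operations

def lookup_torrent(path):
    """Stub: fetch the file's torrent information from the metadata server."""
    raise NotImplementedError

def download_with_bittorrent(torrent, local_path):
    """Stub: download the file from seeders and peers in a client thread."""
    raise NotImplementedError

class BTFSClient(Operations):
    def __init__(self, cache_dir, cache_limit):
        self.cache_dir = cache_dir
        self.cache_limit = cache_limit  # bytes of local cache space
        self.lru = OrderedDict()        # BTFS path -> (local copy, size)

    def _fetch(self, path):
        """Serve from the local cache; download via BitTorrent on a miss."""
        if path in self.lru:
            self.lru.move_to_end(path)  # refresh LRU position on a hit
            return self.lru[path][0]
        local = os.path.join(self.cache_dir, path.strip("/").replace("/", "_"))
        download_with_bittorrent(lookup_torrent(path), local)
        self.lru[path] = (local, os.path.getsize(local))
        # LRU replacement: evict least recently used files when space runs out.
        while sum(size for _, size in self.lru.values()) > self.cache_limit:
            _, (victim, _) = self.lru.popitem(last=False)
            os.remove(victim)
        return local

    def read(self, path, size, offset, fh):
        with open(self._fetch(path), "rb") as f:
            f.seek(offset)
            return f.read(size)

if __name__ == "__main__":
    # Mount with a 10 GB cache at the user's mount point (see Section III.E).
    FUSE(BTFSClient("/var/cache/btfs", 10 * 2**30), "/home/user1/btfs",
         foreground=True)
```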
When writing a new file, the BTFS client performs the write operation locally. After the written file has been closed, it is uploaded to the seeder and the BTFS client registers the file with the metadata server. The POSIX attributes of the file are stored in the associated ZooKeeper node. Other basic operations, such as deleting a file, listing files, and creating and removing a directory, are also supported. However, our current implementation does not support rewriting yet.

E. Operation
Since BTFS is implemented as a file system in user space, each user can mount a BTFS file system at a mount point in the local directory tree for which he/she has permission. The command to mount a BTFS file system is as follows.
$btfsmount 192.168.1.1 /home/user1/btfs

The first argument is the IP address of the metadata server. The second argument is the local mount point. Once the BTFS is mounted, the user can access the files and directories as if they were in local storage.

IV. EXPERIMENTAL RESULTS
This section describes the experimental setup and the preliminary results of using BTFS in distributed rendering.

A. Testbed System Configuration
We carried out the experiments on a testbed system working as a distributed render farm. The testbed consists of a set of servers located at a central site and multiple clients at different remote sites, simulating volunteer-based rendering in which users donate their desktop or notebook computers to render the animation of a particular project.

We set up the testbed on 5 remote sites, i.e. NECTEC, UTCC, INET, CAT and CSLOXINFO, in which NECTEC is chosen as the central site hosting the metadata server, seeder and tracker. We allocate 7 clients from the remaining sites. As for the hardware specification of the client machines, the CPU speed varies slightly between 2.2 and 2.6 GHz; however, they all have 4 CPU cores and 4 GB RAM, with at least 50 GB of free hard disk space. They connect to the Internet with public IP addresses, bypassing any enterprise firewalls. The network bandwidth between sites varies slightly over time around 100 Mb/s bi-directional; however, the outgoing bandwidth from the central site to the remote sites is throttled to 10 Mb/s for the experiments. All machines run Linux CentOS 6.2.
B. Render Data and Software
The data used in the experiments are from the Big Buck Bunny project [15], an open animation movie initiated by the Blender software [16] development team. The entire animation has a running time of 10 minutes and is generated from more than 400 source files totaling 1.2 GB of data. The project has publicly released all 3D models, images and texture files used during its production. The animation is composed of 13 scenes separated into top-level folders under the project directory. Each scene may further be broken into sub-scenes, each of which is stored as a .blend file. When rendering a .blend file, the other files referenced from it have to exist. Different scenes have distinct computational requirements: some scenes can finish rendering in a few minutes and use only a small amount of memory, while others may take an hour with much larger memory. Thus, we select two scenes representing different computational requirements, labeled small and large, as shown in Table I. The input size is the total size of the .blend file and all referenced files required for rendering. Note that the rendering time is measured with the files on local disk.

TABLE I. CHARACTERISTICS OF THE TESTING DATA

Job Size | Scene             | No. of Frames | Mem (MB) | Avg. Time (min per frame) | Input Size (MB)
Small    | 12_peach/03.blend | 28            | 650      | 1:30                      | 90
Large    | 01_intro/02.blend | 91            | 3,500    | 9:06                      | 290

We deploy DrQueue 0.63.4 [17], a job scheduler for distributed render farms, on the testbed. The DrQueue master process is installed on the server at the NECTEC site. The DrQueue slave process and Blender 2.49 [16], the rendering software, are installed on all clients. The input files (i.e., the .blend and all referenced files) must be accessible from the clients when jobs are executed there. Thus, we set up BTFS and SAMBA 3.5.10 [18] for this purpose. Note that SAMBA is a network file system frequently used to share files in distributed rendering; it uses a centralized file server implementing the SMB/CIFS protocol. To access files from SAMBA, we use the FuseSMB [19] and Linux CIFS 2.56.32 [20] file systems, which implement the SMB/CIFS protocol at the clients in user space and kernel space, respectively. We then compare the performance of animation rendering under the different file systems.
C. Performance Comparison of BTFS and SMB File Systems
We submit the small and large jobs described above to the testbed. All jobs are rendered at 25% of full HD resolution (480x270 pixels). A single file server (or seeder, in the case of BTFS) is used. We then measure the average time to render the jobs on the testbed, as depicted in Figure 3. The total rendering time varies with the file system used: BTFS gives the best time, CIFS is second, and FuseSMB is the worst. These results follow from the reduction of data transferred from the server. FuseSMB never caches files locally, so the same data have to be retransmitted from the server for every frame rendered. In contrast, CIFS implements a page cache which can automatically cache all relevant files in memory for the next frames, significantly reducing the data transfer time. Nevertheless, compared to BTFS, the performance of CIFS is slightly worse for the small job, since there is no peer to share the data. Interestingly, when the job becomes large, the performance of CIFS is remarkably lower than that of BTFS. This is because the job requires a large amount of memory, so little free memory is left and the page cache is not effective. As a result, most data still have to be retransmitted from the server. On the other hand, since BTFS caches files in persistent storage, there is no such free-memory effect.
Figure 3. Rendering time under different file systems.
We also measure the load on the file server (or seeder) in terms of the amount of data downloaded when rendering the jobs. Figure 4 illustrates that only a small amount of data is downloaded from the BTFS seeder, as clients can exchange data in the P2P manner, which makes the load on the seeder the lowest. On the contrary, the loads on the server under CIFS and FuseSMB are much higher, especially when the job is large.
Figure 4. Load on the file server (seeder).

V. CONCLUSIONS
This paper presents the design and implementation of the BitTorrent File System, or BTFS, which aims to reduce the data transfer time and improve the performance of distributed rendering. BTFS allows rendering software, as well as other applications, to share and exchange data in a peer-to-peer and transparent manner. Many components of BTFS are built around well-developed open source software and standard protocols so that BTFS gains from their maturity and stability. We have carried out experiments on a testbed using a production-grade 3D animation. The results show that the performance of distributed rendering using BTFS is better than with traditional network file systems, and the load on the file server is much reduced. The security, scalability and high availability of BTFS will be addressed in future work.
ACKNOWLEDGMENT
The authors are supported by a grant from the research fund of the University of the Thai Chamber of Commerce; we are greatly indebted to it for its generous financial support of this research. The authors also thank the National Electronics and Computer Technology Center for providing access to the computing resources used in this work.

REFERENCES
[1] "Distributed Rendering," Available at http://www.isgtw.org/visualization/distributed-rendering/, September 7, 2011.
[2] "Free Rendering by the People for the People," Available at http://www.renderfarm.fi/, 2012.
[3] "Volunteer Computing," Available at http://boinc.berkeley.edu/trac/wiki/VolunteerComputing/, 2012.
[4] B. Cohen, "Incentives Build Robustness in BitTorrent," Workshop on Economics of Peer-to-Peer Systems, Berkeley, CA, USA, May 22, 2003.
[5] A. Kaplan, G.C. Fox, and G.v. Laszewski, "GridTorrent Framework: A High-Performance Data Transfer and Data Sharing Framework for Scientific Computing," Grid Computing Environments Workshop, Reno, Nevada, USA, November 11-12, 2007.
[6] A. Zissimos et al., "GridTorrent: Optimizing Data Transfers in the Grid with Collaborative Sharing," Proceedings of the 11th Panhellenic Conference on Informatics (PCI 2007), Patras, Greece, May 18-20, 2007.
[7] F. Costa et al., "Optimizing the Data Distribution Layer of BOINC with BitTorrent," Proceedings of the 2008 IEEE International Parallel and Distributed Processing Symposium (IPDPS 2008), Miami, Florida, USA, April 14-18, 2008, pp. 1-8.
[8] D.P. Anderson, "BOINC: A System for Public-Resource Computing and Storage," Proceedings of the Fifth IEEE/ACM International Workshop on Grid Computing (GRID 2004), Pittsburgh, Pennsylvania, USA, November 8, 2004, pp. 4-10.
[9] B. Wei, G. Fedak, and F. Cappello, "Collaborative Data Distribution with BitTorrent for Computational Desktop Grids," Proceedings of the 4th International Symposium on Parallel and Distributed Computing (ISPDC 2005), France, July 4-6, 2005.
[10] B. Wei, G. Fedak, and F. Cappello, "Towards Efficient Data Distribution on Computational Desktop Grids with BitTorrent," Future Generation Computer Systems, Vol. 23, No. 8, November 2007, pp. 983-989.
[11] F. Cappello et al., "Computing on Large-Scale Distributed Systems: XtremWeb Architecture, Programming Models, Security, Tests and Convergence with Grid," Future Generation Computer Systems, Vol. 21, No. 3, March 1, 2005, pp. 417-437.
[12] "Dropbox," Available at https://www.dropbox.com/, 2013.
[13] "FUSE: Filesystem in Userspace," Available at http://fuse.sourceforge.net/, 2012.
[14] "Apache ZooKeeper," Available at http://zookeeper.apache.org/, 2012.
[15] "Big Buck Bunny," Available at http://www.bigbuckbunny.org/, June 2010.
[16] "Blender," Available at http://www.blender.org/, 2009.
[17] "DrQueue, the Open Source Distributed Render Queue," Available at http://www.drqueue.org/, 2013.
[18] "Samba – Opening Windows to a Wider World," Available at http://www.samba.org/, July 26, 2011.
[19] "SMB for Fuse," Available at http://www.ricardis.tudelft.nl/~vincent/fusesmb/, 2007.
[20] "The Linux Kernel Archives," Available at https://www.kernel.org/, 2007.