A Collaborative Fault-Tolerant Transfer Protocol for Replicated Data in the Cloud
Nader Mohamed and Jameela Al-Jaroodi
Faculty of Information Technology, UAEU, Al Ain, P.O. Box 17551, UAE
[email protected] and [email protected]
Abstract—This paper proposes a collaborative fault-tolerant transfer protocol for replicated data available on the Cloud and the Grid. The technique utilizes the availability of replicated data on multiple servers to provide fault-tolerant data transfer and to enhance download times through concurrent downloads of the requested data. While it provides fast and reliable file transfers, it does not impose extra communication and processing overhead compared to other concurrent or parallel data transfer techniques. The proposed technique allows multiple servers to collaborate in downloading the files without requiring run-time coordination among the servers. In addition, there is no need for periodic monitoring to discover server and network failures to achieve fault-tolerance. Furthermore, the transfer operation will continue even if all servers except one fail. The proposed technique is most suitable for heterogeneous dynamic environments with varying network conditions and server loads. It has been implemented and evaluated, and the results show considerable performance and reliability gains for data downloading compared to other approaches.
Keywords— Fault Tolerance; Parallel File Transfer; FTP; Load Balancing
I. INTRODUCTION
Large files and huge data downloads take a lot of time and resources to complete, and failures during the download operation can cause significant delays. Therefore, it is important to have mechanisms that allow the download to continue in the presence of faults. The most common type of fault is the fail-stop fault, where one or more components of the system fail completely. Other types include fail-benign, fail-symmetric, and fail-asymmetric faults. In addition, faults can be permanent, transient, or intermittent [20]. When dealing with file downloads, most of the possible failures involve hardware and system failures where the server, client, and/or the network in between fail for some reason. Fault-tolerance is the ability to keep the system or application running despite the existence of one or more faults. This requires mechanisms to detect faults, divert operations around them, and make sure the results remain correct. Most fault-tolerance techniques rely on redundancy, where several copies of the system and application components are available and a failover is performed to shift the operations from the faulty component to a correct replica. Some techniques also use replication, where multiple versions are executed in parallel and the correct result is selected from among the multiple results made available. In either case resources are wasted, either by being kept idle until a fault occurs or by being forced to repeat the same tasks multiple times.

Parallel file download is used to speed up the download when multiple copies of the requested file are available. Replicated FTP servers are available over the Internet to supply users with files for different purposes, such as new applications, upgrade components, or important information. In addition, replicated FTP servers are available in Grid environments, from which different Grid applications and users can download files and utilize them. These files are usually very large and contain important scientific experiments and observations, such as climate simulation data [4] and high-energy physics data [6]. The Internet, Cloud, and Grid environments have characteristics distinct from other systems and involve many components (hardware and software), which increases the possibility of failures.

Several methods and techniques were introduced to enhance download times using multiple servers and multiple clients. Many of these addressed the issue of load balancing among the servers, and a few also addressed the issue of failures. However, these efforts do not offer efficient solutions for fault-tolerance. In recent work [1], DDFTP performs parallel file downloads from multiple FTP servers with inherent load balancing. Our technique (Fault-Tolerant DDFTP) extends DDFTP and offers collaborative fault-tolerant parallel file downloads from replicated FTP servers. This technique offers reliable and fast downloads with built-in adjustments to adapt to possible failures.
In this paper we cover some basic concepts on file downloads and the related work on load balancing and fault-tolerance in parallel file transfer in Section II. In Section III we describe our proposed technique, fault-tolerant DDFTP, and explain how fault-tolerance is achieved while minimizing overhead. We then evaluate the performance of our technique in Section IV. Finally, we conclude the paper in Section V.
II. RELATED WORK
The TCP protocol [18][21] provides reliable, ordered communication; thus, distributed applications can safely rely on TCP to receive ordered and correct data. In addition, TCP
has become the de facto standard; therefore, most standard FTP implementations use TCP as the underlying protocol due to its reliability features. The original FTP, as described in RFC 959 [10], was designed to support file transfer using the client/server model, serving a single connection at a time. FTP includes a fault-tolerance feature that allows the manual restart of a file transfer from the point of failure; however, this feature is not included in many current FTP implementations. Several enhancements and extensions to the original FTP were made and published in different RFCs such as RFC 1579 [11], RFC 2640 [15] and RFC 3659 [7]. In addition, researchers and organizations are working on different implementations and/or versions of FTP to serve specific types of environments or applications, providing specific optimizations that match their needs and offer better performance accordingly. For example, the file exchange protocol (FXP) [9] is designed to allow a client to initiate file transfers directly between servers without the data passing through the client. This facilitates high-bandwidth transfers among servers and reduces the overhead on the client. However, this was still based on the single-server single-client model, so the enhancements were always bound by the available bandwidth between the servers and the client. In addition, fault-tolerance is only achievable by including checkpointing mechanisms such that the system can resume downloads from the failure point when the error is fixed and the failing component is recovered.
Another potential improvement was done as part of PFTP [5], where striping and parallel connections are used to move data from cluster to cluster. PFTP relies on the availability of a parallel file system such as PVFS (Parallel Virtual File System), which stores large files in strips across multiple cluster nodes. PFTP opens multiple connections from the source nodes to the destination nodes to transfer the partitions of the file in parallel. PFTP does not incur high overhead since most of the work is done by the PVFS, yet it cannot operate without it. In addition, this works only when multiple servers are matched by a similar number of clients receiving the different parts simultaneously. Furthermore, the approach does not provision for fault-tolerance unless the PVFS does so through models like RAID level 5 or RAID level 6. A different research group introduced a different PFTP for high performance storage systems (HPSS) [13]. This version of PFTP allows for large file transfers (larger than 2 GB) over multiple client data ports. In addition, it can be used to transfer portions of the file from different servers in parallel if replicas are available. Here as well, the model does not provide fault-tolerance, so server/network failures will cause the download to terminate. In [19] the authors present a parallel download scheme that guarantees that the requested file is received from the fastest available mirroring site. Although the idea is appealing, it does not make use of parallel transfer efficiently since only one copy is ultimately used. However, this model offers fault-tolerance since it can keep running even if all but one mirror site fail, without the need to actually discover the failures beforehand.
To achieve better performance, researchers explored redundancy and parallelism at different levels, both to improve transfer speed and to provision for fault-tolerance. Several ideas were explored; one of the earliest was GridFTP [3], which extends FTP by adding features for efficient mass data transfer using parallel TCP streams and data striping over multiple servers to overcome the TCP buffer/window size limitation. GridFTP also implements the fault-tolerance feature in the FTP specifications, which allows for partial transfers starting from the point of failure based on a checkpointing mechanism. The performance of GridFTP was demonstrated for high-volume data and file transfers in [2]. Several enhancements and variations of GridFTP were introduced, such as [17], where a middleware framework (NaradaBrokering) is used to enhance the reliability and performance of GridFTP by separating the application and environment independent features. Another group developed a dynamic parallel data transfer from replicated servers using the GridFTP features [23]. In this model, the client specifies the file to download and an LDAP service locates the available replicas. An algorithm is used to calculate the partition sizes to be retrieved from each replica based on location and current load conditions. During the download, periodic checks of the transfer conditions are done to adjust the partition locations and sizes among the replicas, which adds high overhead, yet it does not offer fault-tolerance. In BitDew [8] the support for file transfer is offered through an abstract level over the Grid and Desktop Grids. The protocols involved (BitDew and Out of Bound Transfer) allow for reliable transfers that can manage and recover from detected failures, yet they do not use parallel transfer.
A different approach for parallel file transfer is to use a proxy that controls the process. This proxy could be located close to either the client or the server depending on the functionalities needed. One example is discussed in [12], where a proxy handles the client request for file download, retrieves file blocks from various servers, orders them, and then delivers them to the client. This approach provides load balancing based on monitoring server performance and relieves the client from this task. However, it requires a large buffer for ordering the blocks at the proxy. The approach includes a substituting download mechanism that allows for requesting another copy of a block if it does not arrive within a given amount of time. This reduces the buffer size and provides a crude level of fault-tolerance since the duplicate request will circumvent the delaying server, which may have failed. To further reduce the buffer size, variable block sizes are used in [16], where each server is assigned a different block size depending on its perceived performance. This makes the overall transfer times from all servers very similar and reduces the need for long-term buffering at the proxy; however, it does not offer fault-tolerance since there is no defined mechanism to recover when a block does not arrive at all. This method poses some overhead to discover the servers' performance, assign blocks, and handle control information. Also, to enhance download times and reduce
costs in grid environments, researchers proposed a dynamic server selection scheme coupled with P2P co-allocation based on available bandwidth values between the client and the different servers [14]. In this case, an algorithm is used to select the best suited servers among those containing file replicas and to dynamically change the block sizes among servers based on the bandwidth. In addition, substitution download is also used to retrieve delayed blocks from other servers, which in turn offers some fault-tolerance. These techniques offer good load balancing, yet they all introduce high overhead and may face problems when loads and operating conditions vary irregularly. Another example of co-allocation uses an anticipative recursively adjusting mechanism (ARAM) [22] to adjust block sizes based on the anticipated bandwidth between the servers and the client. In this model parallel download enhances the performance and the method can adjust to avoid broken links or failed servers between assignments. However, if a server or link fails during the transmission of a block, the method cannot recover the lost part of that block.

In general, various concurrent and parallel techniques are used to enhance performance and provide clients with fast file downloads, and several of them also offer fault-tolerance as part of the model. However, as in the general case of parallelism, adding more resources does not always result in a matching improvement. In most cases the overhead imposed and the control issues may result in lower performance gains than expected and require much more effort than the normal methods. When fault-tolerance is introduced, it often becomes a tradeoff between performance and the overhead imposed to discover faults and recover from them. As a result, many of the models we explored, and others we did not mention, offer good enhancements but still leave room for improvement. In the following sections we extend the DDFTP [1] technique for parallel file downloads, which offers high performance while minimizing coordination and management overhead, by introducing fault-tolerance. In this technique we download file partitions from replicated servers and rely on the characteristics of TCP [21] to help in the ordering and reconstruction of the file from the delivered blocks on the client side.
III. FAULT-TOLERANT DUAL-DIRECTION FTP

FTP supports two main operations: GET and PUT. Files can be available on multiple FTP servers such that the GET operation can benefit from some form of concurrency. However, in most cases the PUT operation is done from a single client to one FTP server; therefore, the only method of parallelization is to use multiple streams, which overcomes some of the limitations of the communication protocols, while further optimizations require more complex approaches. In this paper, we describe the dual-direction FTP (DDFTP) technique with fault-tolerance to enhance download (GET) speed and reduce the concurrent transfer overhead. We will first explain how the technique works in the basic form using only two file replicas on two servers. Then we will extend it to cover any number of replicas on different servers.

DDFTP [1] relies on file partitioning and is based on the concept of parallelization without synchronization. In the case of two servers, alternating block transfers from the beginning of the file would require continuous monitoring and control and may not achieve good load balancing. Instead, we commence the transfer from either end of the file and continue until the servers meet somewhere in the middle. Thus there is no need for constant monitoring or reallocation of the load during the download. In addition, the DDFTP servers do not need to be aware of any synchronization or coordination requirements; those are present and executed locally on the DDFTP client alone. In this model we rely on the client to decide when to stop the servers from transmitting further blocks. In the original design, the client asks the servers to stop transmission as soon as it observes that the next two blocks to receive have consecutive numbers, which means they are in transit and will arrive soon to complete the file. To account for missing blocks, we need to redesign the client to allow for some overlap in the block downloads to compensate for any loss or delays due to the failure of one of the servers or networks.

The overall benefit of this approach is that DDFTP distributes the download efforts, thus reducing the restrictions imposed by the TCP flow and error controls and allowing the client to fully utilize whatever bandwidth is available to it by accepting multiple flows at the same time. Simultaneously, DDFTP makes use of the reliable in-order delivery features of TCP, thus eliminating the need to add block number headers to help the client order the blocks. As we allow for overlap, we also gain the benefit of fault-tolerance without having to spend effort on fault discovery or wait to discover the faults.

A. The Dual-Server Case

The dual-source DDFTP is the basic form of the technique, where we only have two replicated copies of the file to be downloaded. Here the client first needs to obtain the file information and the replica locations, which could be available in a file registry. With that information the client decides on the size of the blocks, which also defines the total number of blocks to download. The client controls the servers using two control messages. The Start message contains the fileName, which indicates the file to download; the blockSize, which indicates the size of each block; the firstBlock, which tells the server where to start the download; and the counterMode (increment or decrement), which indicates the direction of the download. The End message only carries the fileName to tell the server to stop sending more blocks from that file. To begin the download, the client prepares and sends the two Start messages, and when all blocks are received it sends two End messages.
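As a rough illustration of this control flow, the sketch below shows how a dual-server client could construct the two Start messages and the matching End message. This is our own minimal rendering for illustration: the field names follow the description above, but the Python types, the build_start_messages helper, and the example file name are assumptions, since the paper does not prescribe a concrete message format.

from dataclasses import dataclass

# Hypothetical layouts for the DDFTP control messages; the paper describes
# the fields but not a concrete wire format.

@dataclass
class StartMessage:
    file_name: str      # file to download
    block_size: int     # size of each block in bytes
    first_block: int    # block number the server starts from
    counter_mode: str   # "increment" or "decrement"

@dataclass
class EndMessage:
    file_name: str      # tells the server to stop sending blocks of this file

def build_start_messages(file_name: str, file_size: int, block_size: int):
    """Build the two Start messages for a dual-server download: one server
    counts up from the first block, the other counts down from the last
    block, so they meet somewhere in the middle."""
    total_blocks = -(-file_size // block_size)  # ceiling division
    forward = StartMessage(file_name, block_size, 0, "increment")
    backward = StartMessage(file_name, block_size, total_blocks - 1, "decrement")
    return forward, backward

if __name__ == "__main__":
    fwd, bwd = build_start_messages("replica.dat", file_size=500 * 2**20, block_size=2**20)
    print(fwd)
    print(bwd)
    # When all blocks have been received, the client sends an End message to each server:
    print(EndMessage("replica.dat"))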
The client also keeps a current status record (CSR) to track the progress of each server. This record includes, for each server, the serverName; the lastBlock, which is the number of the last block received; the counterMode, which indicates the direction of download; the partnerName, which is the name of this server's partner (the other server in the pair); totalBlocks, which is the total number of blocks
received from this server; receiveTime, which specifies the time when the last block was received; and the serverStatus, which indicates whether the server is available (null) or failed (one). A cutOff value can be set to help identify failed servers; it can either be fixed or dynamically adjusted during runtime to reflect current server performance expectations. All servers start with serverStatus null, indicating they are all available. The serverStatus is changed to one in two possible cases: (1) when the time since receiving the last block has exceeded the cutOff time, or (2) when the network fails and the TCP connection is dropped. In the dual-server case the last two parameters are relatively unnecessary; however, in the k-server case, they will be important to maintain fault-tolerance with multiple failures.
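A minimal sketch of the CSR bookkeeping described above, assuming a simple in-memory record per server. The field names mirror the description (serverStatus uses 0 for available, i.e. null, and 1 for failed), while the dictionary keyed by server name, the time.monotonic timestamps, and the check_failures helper are our own illustrative choices.

import time
from dataclasses import dataclass, field

@dataclass
class ServerRecord:
    """One CSR entry per server, as described above."""
    server_name: str
    partner_name: str            # the other server in this pair
    counter_mode: str            # "increment" or "decrement"
    last_block: int = -1         # number of the last block received
    total_blocks: int = 0        # blocks received from this server so far
    receive_time: float = field(default_factory=time.monotonic)
    server_status: int = 0       # 0 (null) = available, 1 = assumed failed

def record_block(csr: dict, server_name: str, block_number: int) -> None:
    """Update the CSR entry when a block arrives from a server."""
    rec = csr[server_name]
    rec.last_block = block_number
    rec.total_blocks += 1
    rec.receive_time = time.monotonic()

def check_failures(csr: dict, cutoff: float) -> list:
    """Mark a server as failed when no block has arrived within the cutOff time."""
    failed = []
    now = time.monotonic()
    for rec in csr.values():
        if rec.server_status == 0 and now - rec.receive_time > cutoff:
            rec.server_status = 1
            failed.append(rec.server_name)
    return failed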
The implementation of the DDFTP servers is relatively simple; each server is multithreaded to handle multiple requests. In addition, each server simply executes the Start and End messages received from the client, following the attributes included in these messages. The DDFTP client is multithreaded so that it can manage the download from the two DDFTP servers and keep track of incoming blocks. It prepares the Start messages and initializes the CSR for each server. During the download the client receives blocks and updates the CSRs, while watching for the overlap in received blocks so it can prepare the two End messages to send to the servers. Duplicate blocks arriving after that simply overwrite their earlier copies. As a result, the client ensures that it has received all blocks even in the presence of a failed server or network connection. In addition, there is no need to add any extra instructions or controls in the servers to compensate for the failures.
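The client-side block tracking can be kept very simple. The following sketch, which is our own illustration rather than the authors' implementation, keeps received blocks in a preallocated list so that duplicate blocks simply overwrite their earlier copies, and reports completion only once every block is present, regardless of which server delivered it.

class BlockCollector:
    """Client-side reassembly buffer for one file download."""

    def __init__(self, total_blocks: int):
        # In-memory storage for brevity; a real client would write blocks to disk.
        self.blocks = [None] * total_blocks
        self.remaining = total_blocks

    def deliver(self, block_number: int, data: bytes) -> None:
        """Store an incoming block; a duplicate overwrites its earlier copy."""
        if self.blocks[block_number] is None:
            self.remaining -= 1
        self.blocks[block_number] = data

    def complete(self) -> bool:
        """True once every block has been received, from either server."""
        return self.remaining == 0

    def assemble(self) -> bytes:
        """Concatenate the blocks into the final file content."""
        assert self.complete()
        return b"".join(self.blocks)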
After receiving the Start messages, both DDFTP servers will concurrently process and transfer the requested file blocks from different directions, as illustrated in Figure 1. To account for possible failures in the servers or their network connections, the client will wait until it can confirm receiving all required blocks. This may result in a slight overlap in the received blocks from the servers due to the delays experienced when sending the End message. However, the client can simply discard the extra copies of blocks already received. The technique ensures that the faster DDFTP server will transfer more blocks than the slower DDFTP server during the same time period; thus, they automatically achieve load balancing. In addition, if one of the two servers fails during the download process, the other one will automatically pick up the work and continue the download for the remaining parts of the file without even knowing about the failure. As a result there is no time wasted to discover the server failure and find an alternative for it, as in many other fault-tolerance methods. Therefore, the value of m (the meeting point) depends on the load on both DDFTP servers and the point of failure (if any) of one of the servers.

Since both DDFTP servers send blocks from different directions, there is no need for coordination between them. The TCP reliability mechanism ensures that there is no block loss or corruption during transmission, but it does not compensate for connection or server failures. In this case the client only needs to keep track of incoming blocks using the CSR and stop the servers when all blocks are received. This automatically achieves fault-tolerance since any failure in one server will simply cause blocks from that server to stop arriving at the client, while the other server continues normally. Therefore, the meeting point, m, will just shift to be closer to the failed server's side and the download time will increase in proportion to the number of remaining blocks after the failure (see Figure 2).

Figure 1. Block processing and downloading directions by dual DDFTP servers.

Figure 2. Block processing when DDFTP2 fails while sending block bn-1. DDFTP1 will send all blocks including bm (which is bn-1 in this case).
B. The k-Server Case

DDFTP in the dual-server case offers the best possible load balancing in addition to fault-tolerance, provided that at any point in time only one server fails. Here we extend the technique to multiple servers. The general technique will still maintain efficient load balancing and tolerate multiple failures (it will continue even if all but one server fail) among the available servers. The general approach applies to any number of servers; however, to simplify the explanation, we will deal with an even number of replicated servers. Having k DDFTP servers, where k is even, the servers are divided into k/2 pairs. In addition, the requested file is divided into n equal-sized blocks grouped into k/2 partitions, as shown in Figure 3. If we have prior knowledge of the aggregate performance of each pair, we can choose the partition sizes to be proportional to their performance to start with a better balance among the pairs. Each pair proceeds as described for the dual-server case. However, the difference occurs when pairs start finishing their work at different times. During the transfer, any pair that finishes its current partition is reassigned to help another pair still working on a partition. A pair that completes its partition is a freePair, while a pair that did not finish is a busyPair. To provide the help, the partition of the busyPair is divided further into two partitions, leftPart and rightPart, and the servers of the freePair are split such that each server is paired with one of the servers in the busyPair. The new servers in each partition start working from the newly assigned points, while the original servers from the busyPair just continue their original work without any interruption or changes (see Figure 4). This process is repeated until there are no more busyPairs. All of this is controlled by the DDFTP client using the Start and End messages. When a partition is complete, the client sends two End messages to its servers, which become a freePair, telling them to stop. Then two Start messages are sent to this freePair's servers to inform them of their new assignment (helping one of the busyPairs).
Figure 3. DDFTP k-servers technique.

Figure 4. Top: a snapshot of 4 DDFTP servers processing 2 partitions; the 2nd partition is completed by DDFTP3 and DDFTP4 while the 1st partition is still being processed. Bottom: the reassignment of the free servers DDFTP3 and DDFTP4 to start helping DDFTP1 and DDFTP2.
For the case of an odd number of DDFTP servers (k is odd), the same method is adjusted such that k-1 servers are organized in pairs and the last server spawns two threads, each assigned a separate TCP connection, such that the two threads form a virtual pair. This pair is given a smaller partition to work on in the same manner as the other pairs. The reassignment steps are also adjusted in a minor way to handle this virtual pair in case it completes its work before other pairs and needs to be reassigned.

As in the dual-server case, fault-tolerance is achieved by allowing the servers in a pair to slightly overlap their work, such that the client does not issue the End messages to any pair until it has completely received all required blocks in the partition. Therefore, if any server from one pair fails, the other one will continue sending blocks of the partition until it reaches the point where the other server in the pair stopped. Therefore, regardless of where the failure occurred, we do not need to know about it; we will just notice some degradation of the download performance for this pair. This degradation is directly proportional to the amount of remaining work to be done by the surviving server in the pair. However, the client, using the receiveTime value in the CSR, can assume a failure if the time since the last received block is higher than the cutOff value. In that case it changes the serverStatus to one to indicate the failure. This information is used at reassignment time to decide what to do with the pair.
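To make the pairing concrete, the sketch below shows one possible way to group k servers into pairs, including the virtual pair formed from the last server when k is odd, and to split the file into k/2 contiguous partitions in proportion to the pairs' expected performance. The function names, the speed weights, and the block-range representation are our own assumptions for illustration; for a virtual pair the caller can simply pass a lower weight so it receives a smaller partition, as described above.

def form_pairs(servers: list) -> list:
    """Group servers into pairs; for an odd count, the last server is used
    twice, forming a 'virtual pair' of two connections to the same server."""
    pairs = [(servers[i], servers[i + 1]) for i in range(0, len(servers) - 1, 2)]
    if len(servers) % 2 == 1:
        pairs.append((servers[-1], servers[-1]))
    return pairs

def split_partitions(total_blocks: int, pair_speeds: list) -> list:
    """Assign each pair a contiguous block range proportional to its expected
    aggregate speed (equal speeds give roughly equal partitions)."""
    total_speed = sum(pair_speeds)
    ranges, start = [], 0
    for i, speed in enumerate(pair_speeds):
        if i == len(pair_speeds) - 1:
            end = total_blocks                      # last pair takes the remainder
        else:
            end = start + round(total_blocks * speed / total_speed)
        ranges.append((start, end))                 # blocks [start, end)
        start = end
    return ranges

# Example: 5 servers -> 2 real pairs plus one virtual pair with a lower weight.
pairs = form_pairs(["s1", "s2", "s3", "s4", "s5"])
partitions = split_partitions(total_blocks=500, pair_speeds=[1.0, 1.0, 0.5])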
We can assume that there could be single or multiple failures in the system; therefore, we must compensate in different ways. If a failure occurs in one server in a pair, the other server will continue until the partition is finished. This remains true even if several servers fail, as long as only one server per pair fails. However, if a pair with a failed server (as indicated by the CSR) becomes a freePair while there are still some busyPairs, then we need to adjust the reassignment process. First, the client will try to restart the failed server, which could be possible if the fault was transient. If the server restarts, the client updates the serverStatus in the CSR and the pair is reassigned normally to help one of the busyPairs. If the failed server does not restart, then the client will split the active server to form a virtual pair and then reassign it to help a busyPair. However, if at the same time another freePair is available with a similar situation, the two active servers from both pairs are grouped to form a new freePair. The new
freePair is then assigned to help a busyPair. The second possible failure case is when both servers in a pair fail. In this case the partition will not be completed and the pair will remain a busyPair until another pair becomes a freePair and is reassigned to help this pair. The reassignment of a freePair can be further optimized to minimize the effects of slow and failed servers as follows:
Slowest busyPair First: Here if multiple busyPairs are available, the freePair is assigned to help the slowest of those pairs. That is the pair that has the most blocks left in its partition. This will allow us to single out a busyPair where there is a possible failure either in one or both of its servers.
Opposites Attract: After selecting the busyPair to help, we pair the freePair servers with the busyPair servers based on their speeds such that the faster free server is paired with the slower busy server and vice versa. As a result, we attempt to balance the overall work load between the two pairs. This also helps compensate if one of the busyPair servers has failed as it will look slower and in need of more help.
Uneven Partitions: After applying the first two strategies, we can now make the new partition sizes proportional to the calculated performance of the new pairs. Using the totalBlocks field from each server's CSR, we can determine the overall performance of the new pairs. For example, if in one pair the total number of blocks processed was 20 and in the other it was 30, then the remaining blocks are partitioned on a 2:3 ratio and the larger partition is assigned to the faster pair. This provides a better balance and allows us to compensate for possible failures, as the slower pair, which may have been slow due to a failure, will be given a smaller partition.
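The three strategies above can be combined into a single reassignment step. The following sketch is our own rendering of that logic, with illustrative data structures (each pair is a dictionary carrying its two servers as (name, blocks_done) tuples and the remaining block range): it helps the busy pair with the most blocks left, cross-pairs the faster free server with the slower busy server, and splits the remaining range in proportion to the blocks processed so far by each new pair.

def reassign(free_pair: dict, busy_pairs: list):
    """Pick a busy pair to help and build the two new server pairings.
    Pair layout (illustrative): {"servers": [(name, blocks_done), (name, blocks_done)],
                                 "remaining": (start_block, end_block)}"""
    # 1. Slowest busyPair first: help the pair with the most blocks left.
    target = max(busy_pairs, key=lambda p: p["remaining"][1] - p["remaining"][0])

    # 2. Opposites attract: the faster free server joins the slower busy server.
    free_fast, free_slow = sorted(free_pair["servers"], key=lambda s: s[1], reverse=True)
    busy_slow, busy_fast = sorted(target["servers"], key=lambda s: s[1])
    new_pairs = [(free_fast, busy_slow), (free_slow, busy_fast)]

    # 3. Uneven partitions: split the remaining range in proportion to the
    #    total blocks processed so far by each new pair (faster pair gets more).
    start, end = target["remaining"]
    weights = [a[1] + b[1] for a, b in new_pairs]
    total = sum(weights) or 1
    cut = start + round((end - start) * weights[0] / total)
    return target, [(new_pairs[0], (start, cut)), (new_pairs[1], (cut, end))]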
IV. EXPERIMENTAL EVALUATION
This section focuses on the fault-tolerance feature of DDFTP and demonstrates the efficiency of the technique in terms of minimal overhead. The load balancing experiments and analysis of DDFTP in [1] offer more experiments showing the superior performance of DDFTP compared to similar approaches.

A. The Dual-Server Case

The first experiment was done on a wired local-area network (LAN). The main purpose of this experiment is to get basic information about the performance of both servers as well as the performance of parallel file transfer using different approaches. The same servers and parallel file transfer approaches are used for measuring the performance with fault-tolerance in the following experiments. The three approaches used for parallel file transfer are: concurrent FTP (conFTP), which does not apply load balancing as it uses fixed equal-sized partitions to download from each server; the dynamic adaptive data transfer model (DADTM) [23], which provides adaptive load balancing among the available servers; and DDFTP. The experiment was conducted with other loads present on the servers. These load levels were kept similar for all approaches used in this experiment. The results of transferring a 500MB file are shown in Figure 5.

While the first server (FTP1) provides an effective bandwidth of 2.140603 MB/sec and the second server (FTP2) provides an effective bandwidth of 2.853816 MB/sec, DDFTP provides an effective bandwidth of 4.970722 MB/sec. The efficiency of this parallelisation is 99.53% of the total capacity of both servers, while the parallelism efficiency is 85.63% for conFTP and 93.64% for DADTM. As clearly shown, DDFTP offers the best overall performance, which is mainly attributed to the inherent load balancing and the minimal overhead imposed on the servers.
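For reference, the parallelisation efficiency quoted above follows directly from the measured effective bandwidths, as the ratio of the aggregate DDFTP bandwidth to the sum of the two servers' individual bandwidths:

Efficiency(DDFTP) = 4.970722 / (2.140603 + 2.853816) = 4.970722 / 4.994419 ≈ 0.9953, i.e., 99.53% of the combined capacity of the two servers.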
Figure 5. Performance of different file transfer approaches.

The second experiment was conducted to measure the performance of DDFTP with fault-tolerance and compare it with the performance of a modified version of DADTM with fault-tolerance. DADTM does not provide fault-tolerance by design. However, as DADTM provides good performance that is close to the performance of DDFTP [1], we added a fault-tolerance feature to DADTM such that the adaptation process takes into consideration the discovery of faults after each rescheduled monitoring round. Permanent server faults were injected into the first server after 20, 40, 60, and 80 seconds from the beginning of the file transfer. We experimented with fault-tolerance on both a LAN and a wide area network (WAN), using a WAN emulator to mimic WAN features like high round trip time (RTT) delays. In the WAN experiment we used a 0.240 second RTT (which is close to the RTT for a signal traveling between the USA and Europe). The results for a 500MB file transfer over the LAN are shown in Figure 6.

Figure 6. Fault-Tolerance Performance in LAN.
Here we notice a relatively constant advantage for DDFTP over DADTM, since DDFTP does not have to wait to discover faults at all. Figure 7 shows the results when the WAN is used, where the superiority of DDFTP becomes more obvious with the existence of long delays. These delays make it harder for DADTM to reflect environment changes and faults quickly in the load balancing steps and also delay the discovery of faults. Therefore, the overall performance of DDFTP is better than that of DADTM in all cases.
Figure 7. Fault-Tolerance Performance in WAN.
The third experiment, using a 500MB file, was conducted to further illustrate the effect of long RTT delays on the performance of both DDFTP and DADTM in the presence of faults. All experiments started with two servers and ended with one server. The server fault was intentionally created around 40 seconds after starting the download process. The results are shown in Figure 8. As shown in the figure, the DDFTP download time does not increase much as the RTT increases. However, with DADTM the file download time increases significantly as the RTT increases. This is due to the adaptive algorithm used in DADTM, which needs to collect information from the servers; this information becomes harder to collect and takes longer as the RTT value increases. For example, a faulty server or network will take more time to be discovered by the client with long RTT delays in the environment. This makes the adaptive algorithm less effective for load balancing as well as for fault discovery and resolution. In DDFTP, both the load balancing and the fault-tolerance mechanisms are naturally inherent in the download process as both servers download from different ends. This dual-direction process eliminates the need for collecting information about the servers and their associated networks and also eliminates the need to continuously monitor the servers' changes during the download time. Therefore, we can provide a fast fault-tolerance mechanism that compensates for the faults without even having to discover them.
Figure 8. Impact of RTT in download with server faults.

B. The k-Server Case

In all experiments here, 8 servers were used and the file size was 500MB. In the first experiment, the impact of multiple faulty servers was measured using DDFTP and DADTM. Server faults in up to half the number of servers were intentionally enforced after around half the no-fault download time had passed (that is, the first half of the download is always fault free). An RTT delay of 240ms was used. The result is shown in Figure 9. Both DDFTP and DADTM can handle multiple faults; however, DDFTP provides better handling and better performance in all cases.

Figure 9. Impact of multiple server faults.
In another experiment, the impact of long RTT delays on the fault-tolerance feature was compared. Two faults were intentionally created in two servers after around half of the processing time of the no-fault case had passed. The result is shown in Figure 10. As in the dual-server case, the download time did not increase much in DDFTP compared to DADTM. This shows that DDFTP tolerates delays both in its load balancing mechanism and in its fault-tolerance mechanism. For example, the performance of DDFTP degrades by only 3% as the RTT increases from 100ms to 500ms, while DADTM degrades by almost 11% over the same range. This indicates that DDFTP is well suited for dynamic environments such as the Grid and Cloud, where discovering changes in the environment takes a long time.
Figure 10. Impact of different RTT values on download time.
V. CONCLUSION
Parallel file transfer provides an efficient way to quickly get large files and it also opens up a good opportunity to include load balancing and fault-tolerance mechanisms to enhance the overall reliability of the system over
heterogeneous environments. However, fault-tolerance is an intricate issue to deal with in distributed environments. There are several types of faults to look for, and they can have various durations, such as transient and permanent failures. In most cases it may be possible to mask the faults and allow the system to continue with some level of degradation in performance, while other cases require discovering the faults and recovering from them. Most approaches we investigated either pose long delays to get the missing parts after everyone is done, or impose high monitoring and checkpointing overhead to allow the system to roll back and restart in case of failures. DDFTP provides an excellent fault-tolerance technique inherent in the parallel download method used. By allowing the download within any given partition to start from both ends at the same time, we automatically make sure that the load between the two servers at both ends is balanced, and we also ensure that if one of the two servers fails, the other will automatically pick up the slack and finish the work without any interference. In addition, by enhancing the functionalities of the client, we allow it to discover failures and try to correct them if possible; otherwise it can easily work around them and allow the other servers to finish the download. The experiments have shown that the fault-tolerance mechanism imposes minimal overhead and the work is guaranteed to complete as long as at least one server remains available. To further enhance DDFTP, we will work on improving it and incorporating more advanced features to automate some of the steps, such as server discovery, and increase the usability and transparency of the operations. We will also investigate the PUT operation and develop an efficient approach using dual-direction transfer to speed up file uploads and account for fault-tolerance as well.
REFERENCES

[1] J. Al-Jaroodi and N. Mohamed, "DDFTP: Dual-Direction FTP," in Proc. 11th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid 2011), May 2011.
[2] B. Allcock, J. Bester, J. Bresnahan, A.L. Chervenak, I. Foster, C. Kesselman, S. Meder, V. Nefedova, D. Quesnel, and S. Tuecke, "Data management and transfer in high-performance computational grid environments," Parallel Computing, Vol. 28, No. 5, pp. 749-771, May 2002.
[3] B. Allcock, J. Bester, J. Bresnahan, A.L. Chervenak, C. Kesselman, S. Meder, V. Nefedova, D. Quesnel, S. Tuecke, and I. Foster, "Secure, Efficient Data Transport and Replica Management for High-Performance Data-Intensive Computing," in Proc. IEEE 18th Symposium on Mass Storage Systems and Technologies, San Diego, CA, USA, April 2001.
[4] B. Allcock, I. Foster, V. Nefedova, A. Chervenak, E. Deelman, C. Kesselman, J. Lee, A. Sim, A. Shoshani, B. Drach, and D. Williams, "High-Performance Remote Access to Climate Simulation Data: A Challenge Problem for Data Grid Technologies," in Proc. ACM/IEEE Conference on Supercomputing, November 2001.
[5] D. Bhardwaj and R. Kumar, "A Parallel File Transfer Protocol for Clusters and Grid Systems," in Proc. 1st International Conference on e-Science and Grid Computing, 2005.
[6] J. Bunn and H. Newman, "Data-intensive grids for high-energy physics," in Grid Computing: Making the Global Infrastructure a Reality, Wiley, New York, USA, 2003.
[7] Extensions to FTP, RFC 3659, viewed November 2010, http://tools.ietf.org/html/rfc3659.
[8] G. Fedak, H. He, and F. Cappello, "BitDew: A data management and distribution service with multi-protocol file transfer and metadata abstraction," Journal of Network and Computer Applications, Vol. 32, No. 5, pp. 961-975, September 2009.
[9] File eXchange Protocol (FXP), viewed November 2010, http://en.wikipedia.org/wiki/File_eXchange_Protocol.
[10] File Transfer Protocol, RFC 959, viewed November 2010, http://www.faqs.org/rfcs/rfc959.html.
[11] Firewall-Friendly FTP, RFC 1579, viewed November 2010, http://www.networksorcery.com/enp/protocol/ftp.htm.
[12] J. Funasaka, A. Kawano, and K. Ishida, "Implementation Issues of Parallel Downloading Methods for a Proxy System," in Proc. 4th International Workshop on Assurance in Distributed Systems and Networks, ICDCSW'05, Vol. 1, pp. 58-64, 2005.
[13] HPSS User's Guide, High Performance Storage System, Release 7.1, IBM, USA, February 2009.
[14] C-H. Hsu, C-W. Chu, and C-H. Chou, "Bandwidth Sensitive Co-allocation Scheme for Parallel Downloading in Data Grid," in Proc. IEEE ISPA, pp. 34-39, 2009.
[15] Internationalization of the File Transfer Protocol, RFC 2640, viewed November 2010, http://tools.ietf.org/html/rfc2640.
[16] A. Kawano, J. Funasaka, and K. Ishida, "Parallel Downloading Using Variable Length Blocks for Proxy Servers," in Proc. 27th ICDCS Workshops, p. 59, 2007.
[17] S.B. Lim, G. Fox, A. Kaplan, S. Pallickara, and M. Pierce, "GridFTP and Parallel TCP Support in NaradaBrokering," in Distributed and Parallel Computing, Lecture Notes in Computer Science, Vol. 3719, pp. 93-102, 2005.
[18] L. Parziale, D.T. Britt, C. Davis, J. Forrester, W. Liu, C. Matthews, and N. Rosselot, "TCP/IP Tutorial and Technical Overview," IBM International Technical Support Organization, 2006.
[19] G.N. Rao and S. Nagaraj, "Client Level Framework for Parallel Downloading of Large File Systems," International Journal of Computer Applications, Vol. 3, No. 2, June 2010.
[20] R.W. Butler, "A Primer on Architectural Level Fault Tolerance," NASA Technical Memorandum NASA/TM-2008-215108, February 2008.
[21] Transmission Control Protocol, RFC 793, viewed November 2010, http://www.faqs.org/rfcs/rfc793.html.
[22] C.T. Yang, M.F. Yang, and W.C. Chiang, "Enhancement of anticipative recursively adjusting mechanism for redundant parallel file transfer in data grids," Journal of Network and Computer Applications, Vol. 32, No. 4, pp. 834-845, July 2009.
[23] Q. Zhang and Z. Li, "Data Transfer Based on Multiple Replicas in the Grid Environment," in Proc. 5th Annual ChinaGrid Conference, China, pp. 240-244, July 2010.