Optimizing Large File Transfer on Data Grid

Teng Ma and Junzhou Luo

Dept. of Computer Science and Engineering, Southeast University, 210096, Nanjing, P.R. China
{mateng, jluo}@seu.edu.cn

Abstract. The AMS02 science activities, a collaboration between Southeast University and CERN led by Samuel C.C. Ting, are generating an unprecedented volume of data. Existing FTP causes user discomfort through delays and data loss that vary with network status. It is therefore necessary to raise transfer performance and minimize data loss. In this paper, we introduce BBFTP, a large file transfer tool based on QoS and multi-stream technology, which is used in the AMS02 experiment. Through Monte Carlo simulation data transfer experiments between SEU-SOC and the Shandong University node in ChinaGrid, we found the optimal parameters that improve BBFTP throughput. This paper focuses on work related to the performance enhancement of large file transfer, which is critical to the performance of the SOC (Science Operation Center).

1 Introduction

AMS02 science activities are generating large amounts of data, including Monte Carlo simulation data. This kind of High Energy and Nuclear Physics (HENP) e-science drives the development of ultra-scale networking; its explorations at the high energy frontier are advancing our understanding of the fundamental interactions, structures and symmetries that govern the nature of matter and space-time in our universe. An HENP project like AMS02 encompasses thousands of physicists from hundreds of universities and laboratories in more than 20 countries. Collaboration on this global scale would not have been attempted if the physicists could not count on excellent network performance. Rapid and reliable data transport, at speeds of 1 to 10 Gbps in the future, is a key enabler of these global collaborations in physics. This network characteristic of the grid environment is not the same as a traditional serial line of dial-up modems: the bandwidth*delay product is larger than in traditional networks. TCP performance depends not upon the transfer rate itself, but rather upon the product of the transfer rate and the round-trip delay. This "bandwidth*delay product" measures the amount of data that would fill the pipe; it is the buffer space required at sender and receiver to obtain maximum throughput on the TCP connection over the path, i.e., the amount of unacknowledged data that TCP must handle in order to keep the pipeline full. TCP performance problems arise when the bandwidth*delay product is large. We refer to an Internet path operating in this region as a "long fat pipe", and a network containing such a path as an "LFN". In this paper, we test BBFTP, a bulk data transfer tool, in this LFN environment and find optimal parameters for Monte Carlo simulation dataset transfers [5][7][9].
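
As a rough worked example (the link speed and delay here are hypothetical round numbers, not measurements from our testbed), consider a 1 Gbps path with a 100 ms round-trip time:

```latex
% Bandwidth*delay product of a hypothetical 1 Gbps, 100 ms RTT path
\mathrm{BDP} = \text{bandwidth} \times \text{RTT}
             = 10^{9}\,\mathrm{bit/s} \times 0.1\,\mathrm{s}
             = 10^{8}\,\mathrm{bit} \approx 12.5\,\mathrm{MB}
```

A sender restricted to the traditional 64 KB TCP window can keep only about 0.5% of such a pipe full (roughly 5 Mbps of usable throughput), which is why window scaling and parallel streams matter on an LFN.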


In the remainder of the paper, we first introduce some theories and tools related to large file transfer in high performance computing. In the third section, we describe ways to optimize performance. Through large file transfer experiments with BBFTP in Section 4, we conclude that, when BBFTP is used, the bandwidth among ChinaGrid nodes meets the AMS requirements.

2 Related Works

To improve performance in high bandwidth-delay product networks, congestion control that remains efficient, fair, scalable, and stable plays a key role. The easiest way to improve performance is to open multiple TCP connections in parallel, but this approach leaves the number of connections to be chosen by the user, and too many connections may cause heavy congestion. Several lines of research address this issue, such as FAST TCP [8][10], HighSpeed TCP [11] and BBFTP [1][2][3][6].

FAST TCP [8][10] is a new TCP congestion control algorithm for high-speed, long-latency networks. The authors highlight the approach taken by FAST TCP to address the four difficulties, at both packet and flow levels, which the current TCP implementation has at large windows. They describe the architecture and characterize the equilibrium and stability properties of FAST TCP, and present experimental results comparing their Linux prototype with TCP Reno, HSTCP, and STCP in terms of throughput, fairness, stability, and responsiveness. FAST TCP aims to rapidly stabilize high-speed long-latency networks into steady, efficient and fair operating points in dynamic sharing environments, and the preliminary results are promising.

HighSpeed TCP [11] is an attempt to improve TCP congestion control for large congestion windows, with better flexibility, better scaling, better slow-start behavior, and fairer competition with current TCP, while keeping backward compatibility and allowing incremental deployment. It modifies the TCP response function only for large congestion windows, so as to reach high bandwidth reasonably quickly in slow-start, and to reach high bandwidth without overly long delays when recovering from multiple retransmit timeouts or when ramping up from a period with small congestion windows.

BBFTP [1][2][3][6] is a file transfer system developed by the High-Energy Physics community. It is a non-interactive, FTP-like system that supports parallel TCP streams for data transfers, allowing it to achieve bandwidths greater than normal FTP. BBFTP first splits the file into several parts and creates a parallel process for each part; each process sends its data over its own socket. It implements its own transfer protocol, which is optimized for large files (larger than 2 GB) and is secure, as it does not read the password from a file and it encrypts the connection information.
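
To make the multi-stream idea concrete, the sketch below splits a file into contiguous parts and pushes each part over its own TCP connection in a separate thread. It is a minimal illustration of the general technique under assumed names, not BBFTP's actual protocol: the receiving host, the port numbers and the file name are hypothetical, and a matching receiver that reassembles the parts by offset is not shown.

```python
import socket
import threading
from pathlib import Path

HOST = "lcg.seu.edu.cn"   # hypothetical receiving host
BASE_PORT = 6000          # hypothetical data ports: BASE_PORT, BASE_PORT+1, ...

def send_part(path: Path, offset: int, length: int, port: int) -> None:
    """Send bytes [offset, offset+length) of the file over one TCP stream."""
    with socket.create_connection((HOST, port)) as sock, open(path, "rb") as f:
        f.seek(offset)
        remaining = length
        while remaining > 0:
            data = f.read(min(64 * 1024, remaining))
            if not data:
                break
            sock.sendall(data)
            remaining -= len(data)

def parallel_send(path: Path, streams: int) -> None:
    """Split the file into `streams` contiguous parts, one thread/socket each."""
    size = path.stat().st_size
    part = (size + streams - 1) // streams
    threads = []
    for i in range(streams):
        offset = i * part
        length = min(part, size - offset)
        if length <= 0:
            break
        t = threading.Thread(target=send_part,
                             args=(path, offset, length, BASE_PORT + i))
        t.start()
        threads.append(t)
    for t in threads:
        t.join()

if __name__ == "__main__":
    parallel_send(Path("mc_dataset.bin"), streams=10)  # hypothetical file
```

BBFTP layers its own control protocol, authentication and error handling on top of this basic split-and-stream pattern.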

3 Ways of Optimizing Large File Transfer

From the existing theories and tools, we found that the main factors affecting the performance of large file transfer can be classified into three types: parallel stream number, TCP window size, and compression algorithm.




The easiest way to improve the performance of large file transfer is to open multiple TCP connections in parallel. However, increasing the number of parallel streams may cause two problems, heavy congestion and CPU overload, when too many connections are opened. Therefore, there is a best value of the parallel stream number for every environment.

The TCP window is the amount of outstanding data (unacknowledged by the recipient) that a sender can send on a particular connection before it gets an acknowledgment back from the receiver confirming that some of it has been received. Traditionally, TCP has only a 16-bit field for the window size, which can specify at most 64 KB. The default of 64 KB in most operating systems is too little for a high-bandwidth e-science computing environment. The window scale extension expands the definition of the TCP window to 32 bits and then uses a scale factor to carry this 32-bit value in the 16-bit window field of the TCP header. The TCP Extensions for High Performance allow large windows of outstanding packets on long-delay, high-bandwidth paths by using a window scaling option and timestamps. With this extension, a window scaling factor (0 to 14) can be negotiated at connection setup between the end points, with which window sizes of up to 1 gigabyte can be represented.

The compression algorithm is equally important for large file transfers such as the Monte Carlo dataset. In our transfer experiments, we used "on-the-fly" compression technology to zip the Monte Carlo dataset during transfer.

In the following section, we give the experimental results obtained by varying the parameters associated with TCP window size, parallel stream number, and compression algorithm.
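
Applications see the effective TCP window through the socket send and receive buffer sizes. The sketch below shows how a sender might request larger buffers before connecting; it is only an illustration, since whether the kernel grants the full size depends on system limits (for example net.core.wmem_max and net.core.rmem_max on Linux), and the 2 MB figure is an assumed value, not a recommendation drawn from our measurements.

```python
import socket

WINDOW = 2 * 1024 * 1024  # 2 MB requested buffer (illustrative value)

def connect_with_window(host: str, port: int, window: int = WINDOW) -> socket.socket:
    """Open a TCP connection after requesting larger send/receive buffers."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Buffer sizes must be set before connect() for them to influence
    # the window scale negotiated during the TCP handshake.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, window)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, window)
    sock.connect((host, port))
    # The kernel may clamp or adjust the requested value; report what we got.
    granted = sock.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF)
    print(f"requested {window} bytes, kernel granted {granted} bytes")
    return sock
```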



4 Data Transfer Experiment

The transfer test is carried out between SDU HPC and SEU-SOC. The host "grid5.seu.edu.cn" is configured with Red Hat Enterprise Linux 3, an Intel Pentium 4 CPU, 768 MB RAM, a 60 GB disk and a 100 Mbps network controller. The host "lcg.seu.edu.cn" at the Shandong HPC is configured with Red Hat Linux 7.3, an Intel Xeon 2.8 GHz CPU, 1 GB RAM, a 60 GB disk and 1000 Mbps InfiniBand. Both the client and server of BBFTP 3.2.0 are installed on machines at SEU and SDU, together with Iperf (a bandwidth test tool) and the Traceroute and Ping network measurement tools.










4.1 TCP Window

We transfer a 20 MB Monte Carlo dataset from grid5.seu.edu.cn to lcg.seu.edu.cn at each of the following TCP window sizes: 128 KB, 256 KB, 512 KB, 1 MB, 2 MB, 3 MB and 4 MB. For each window size we used the same number of parallel data streams: all transfers used 10 parallel streams, and the gzip option was on. The throughput for each window size is shown in Figure 1.
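
Such a sweep can be scripted by timing one transfer per window size. The harness below is a hypothetical sketch: `transfer` is any user-supplied routine that moves the 20 MB dataset with the given window, for example a wrapper around the BBFTP client.

```python
import time

WINDOW_SIZES = [128 * 1024, 256 * 1024, 512 * 1024,
                1 << 20, 2 << 20, 3 << 20, 4 << 20]  # 128 KB .. 4 MB
FILE_SIZE = 20 * 1024 * 1024  # 20 MB dataset, as in the experiment

def sweep(transfer):
    """Time transfer(window_bytes) once per window size and report throughput."""
    for window in WINDOW_SIZES:
        start = time.monotonic()
        transfer(window)                       # user-supplied transfer routine
        elapsed = time.monotonic() - start
        throughput_kbps = FILE_SIZE * 8 / 1000 / elapsed
        print(f"window {window // 1024:>5} KB: "
              f"{elapsed:6.1f} s, {throughput_kbps:8.1f} kbit/s")
```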











[Figure 1: available bandwidth in kbit/s (y axis, 0-800) vs. TCP window size from 128 KB to 4 MB (x axis)]

Fig. 1. Transfer throughput vs. TCP window

4.2 Multi-stream Transfer

Whereas the normal FTP protocol implements only one data connection, BBFTP opens multiple data connections depending on the number of streams requested (client option -p and server option -m). The client listens on ephemeral ports (> 1023) and sends its private IP address and the listening ports to the server; the ports can be chosen in a range defined at compile time or at run time (it is possible to define a range of ports when starting the client using the -D option). The server then establishes the data connections to the specified ports from its own port std-1 (5020 by default). We transfer the 20 MB Monte Carlo dataset from grid5.seu.edu.cn to lcg.seu.edu.cn, changing the number of parallel streams each time. In the experiment, the window sizes of sender and receiver are both 64 KB, and the parallel stream number ranges from 1 to 13.
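
For reference, a sweep over stream counts can be scripted around the BBFTP client itself. In the sketch below, only -p (parallel streams) is taken from the options named above; the -e and -u options, the user name and the file paths are assumptions about the client interface and should be checked against the BBFTP documentation [6].

```python
import subprocess
import time

REMOTE = "lcg.seu.edu.cn"          # receiving host from the experiment
SRC = "mc_dataset.bin"             # hypothetical local file name
DST = "/data/mc_dataset.bin"       # hypothetical remote path

def timed_transfer(streams: int) -> float:
    """Run one bbftp transfer with `streams` parallel streams, return seconds."""
    # -e gives the control command, -u the remote user, -p the stream count.
    # Option names other than -p are assumptions; check the BBFTP manual.
    cmd = ["bbftp", "-e", f"put {SRC} {DST}",
           "-u", "amsuser", "-p", str(streams), REMOTE]
    start = time.monotonic()
    subprocess.run(cmd, check=True)
    return time.monotonic() - start

if __name__ == "__main__":
    for n in range(1, 14):          # 1 to 13 streams, as in Figure 2
        print(f"{n:2d} streams: {timed_transfer(n):6.1f} s")
```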

[Figure 2: transfer time in seconds (y axis, 0-400) vs. parallel stream number from 1 to 13 (x axis)]

Fig. 2. Transfer time vs. parallel stream number

Figure 2 shows the results of transferring the 20 MB Monte Carlo dataset from grid5.seu.edu.cn to lcg.seu.edu.cn. As the figure shows, the transfer rate initially increases almost linearly with the number of parallel streams. After the number reaches 6, and again after 11, the trend changes because of limited system resources. This shows that the transfer rate can be improved by increasing the parallel stream number, up to a point.

4.3 "On-the-fly" Compression

BBFTP uses "on-the-fly" data compression to zip the data as it is transferred. This kind of compression is very suitable for Monte Carlo dataset transfers.


We transfer the 20 MB Monte Carlo dataset from grid5.seu.edu.cn to lcg.seu.edu.cn with and without data compression. In the experiment, the send and receive window sizes are both 64 KB, and the parallel stream number ranges from 1 to 13.

[Figure 3: transfer time in seconds (y axis, 0-600) vs. parallel stream number from 1 to 13 (x axis), with gzip and without gzip]

Fig. 3. Transferring with GZIP and without GZIP

Figure 3 shows the transfer of the 20 MB dataset with and without the GZIP option. The x axis represents the parallel stream number and the y axis the transfer time. "On-the-fly" compression during transfer performs better than transferring without GZIP. We also performed a static compression experiment with the RAR and GZIP compression tools and found that the 20.9 MB Monte Carlo dataset was compressed only to 20.7 MB. Static compression of these datasets as a whole is therefore not effective, whereas "on-the-fly" compression during transfer, applied after the file is split into several parts, works well: it greatly reduces the bandwidth consumption of Monte Carlo dataset transfers.
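
The "on-the-fly" idea can be sketched as follows: each part is compressed in a streaming fashion while it is being sent, so no compressed copy is ever written to disk. This is a minimal illustration using Python's zlib over an already-connected socket, not BBFTP's actual implementation; the chunk size and compression level are assumed values.

```python
import socket
import zlib

CHUNK = 64 * 1024  # read granularity (illustrative)

def send_compressed(path: str, sock: socket.socket) -> None:
    """Compress a file while sending it; no compressed copy is stored on disk."""
    compressor = zlib.compressobj(level=6)
    with open(path, "rb") as f:
        while True:
            block = f.read(CHUNK)
            if not block:
                break
            compressed = compressor.compress(block)
            if compressed:
                sock.sendall(compressed)
    # Flush whatever the compressor is still buffering.
    sock.sendall(compressor.flush())
```

A receiver would mirror this with zlib.decompressobj(), decompressing each block as it arrives.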

5 Conclusions

We presented the environment, performance and test results of transferring the Monte Carlo dataset from SEU-SOC to SDU HPC with BBFTP. Concerning the BBFTP tests, three observations from the Monte Carlo dataset transfers are worth noting. First, because of the character of the Monte Carlo dataset, "on-the-fly" compression is very efficient at reducing the bandwidth consumption of AMS dataset transfers. Secondly, as the number of parallel streams increases, the transfer rate improves greatly, but there is a best parallel stream number that balances transfer rate against CPU overload; this number varies with the experimental environment and network status. Third, according to the BBFTP tests, the average bandwidth between SEU-SOC and SDU HPC is 3.43 MB/s for large files, so the network performance meets the AMS requirements for collaboration across ChinaGrid nodes. Utilizing the unused computing resources on ChinaGrid is very important for e-science collaboration.

Acknowledgement. This work is supported by the National Natural Science Foundation of China under the Special Program "Network based Science Activity Environment" (90412014).


References

1. A. Elin, A. Klimentov. Data Transmission Program for the AMS-02 ISS Mission. AMS Note 2001-11-02, Aug. 14, 2002.
2. M. Boschini, A. Favalli, M. Levtchenko. Data Transfer from Central Production Facility to Italian Ground Segment: A Prototype. AMS Note 2003-10-01.
3. A. Elin, A. Klimentov. Data Transmission Programs for the AMS-02 ISS Mission. AMS Note 2001-11-02, November 8, 2001.
4. P. Fisher, A. Klimentov, A. Mujunen, J. Ritakari. AMS Ground Support Computers for the ISS Mission. AMS Note 2002-03-01, March 12, 2002.
5. W. Matthews, L. Cottrell. Achieving High Data Throughput in Research Networks. CHEP 2001.
6. BBFTP documentation. http://doc.in2p3.fr/bbftp/
7. IEPM bulk throughput monitoring. http://www-iepm.slac.stanford.edu/monitoring/bulk/
8. C. Jin, D. X. Wei, S. H. Low. FAST TCP: Motivation, Architecture, Algorithms, Performance. http://netlab.caltech.edu/pub/papers/FAST-infocom2004.pdf
9. S. Ravot. TCP Transfers over High Latency/Bandwidth Networks & Grid TCP. http://netlab.caltech.edu/FAST/meetings/2002july/GridTCP.ppt
10. H. B. Newman. HENP Grids and Networks: Global Virtual Organizations. http://netlab.caltech.edu/FAST/meetings/2002july/HENPGridsNets_FAST070202.ppt
11. S. Floyd. HighSpeed TCP for Large Congestion Windows. Internet draft, draft-floyd-tcp-highspeed-02.txt (2003). http://www.icir.org/floyd/hstcp.html
