A Dynamic Weighted Data Replication Strategy in Data Grids

Ruay-Shiung Chang, Hui-Ping Chang, Yun-Ting Wang
Department of Computer Science and Information Engineering, National Dong Hwa University, Shoufeng, Hualien 974, TAIWAN
Email: [email protected]

Abstract

Data grids deal with huge amounts of data regularly. It is a fundamental challenge to ensure efficient access to such widely distributed data sets. Creating replicas at suitable sites through a data replication strategy can increase system performance: it shortens data access time and reduces bandwidth consumption. In this paper, a dynamic data replication mechanism called Latest Access Largest Weight (LALW) is proposed. LALW selects a popular file for replication and calculates a suitable number of copies and the grid sites for replication. By setting a different weight for each data access record, the importance of each record is differentiated. Data access records from the recent past have higher weights, indicating that they are more valuable references; conversely, data access records from the distant past have lower reference value. The Grid simulator OptorSim is used to evaluate the performance of this dynamic replication strategy. The simulation results show that LALW successfully improves the effective network usage, which means that the LALW replication strategy can identify a popular file and replicate it to a suitable site.

Keywords - Data Grids, Data Replication, Load Balance

1. Introduction

The volume of interesting data is measured in terabytes and will soon reach petabytes, because technology and research capabilities are advancing rapidly [2]. For example, biologists have gathered thousands of molecular structures in order to understand the operation of molecules and proteins, and petabytes of data are generated in high-energy physics to keep track of particle collision results. The demand for managing such huge, distributed, and shared data resources has become increasingly important.

The Data Grid is a solution to this problem. A data grid [2, 7, 8] is composed of a large number of distributed computation and storage resources, enabling efficient management of huge distributed and shared data resources. It is a serious challenge to ensure efficient access to such huge and widely distributed data in a data grid. Replication is a common method used to improve the performance of data access in distributed systems. Creating replicas not only reduces bandwidth consumption but also reduces access latency. In other words, increasing the data read performance from the perspective of clients is the main purpose of a data replication algorithm.

There are two basic data management services in a data grid: GridFTP and Replica Management. GridFTP is an extension of normal FTP that provides efficient data transfer and access to large files. Replica Management is a mechanism for creating and managing multiple copies of files. Replica management services include creating new replicas, registering these new replicas in a Replica Catalog, and querying the catalog to find the requested replicas.

The replication mechanism is divided into three important subjects: which file should be replicated, when to perform replication, and where the new replicas should be placed. Usually, replication from the server to the client is triggered when the popularity of a file passes a threshold, and the client site is chosen either randomly or by selecting the least loaded site.

In this paper, a dynamic replication algorithm is proposed. It is believed that files that were popular in the past will be accessed more than others in the future; this is called temporal locality. Using this property, a popular data file is determined by analyzing the number of accesses to the data files by users. After identifying the most popular file, we trace the client that generated the most requests for it and place a new replica there. Therefore, we have to collect histories of records about the end-to-end data transfers to determine which file should be replicated. The popularity of a file can be deduced from its access rate by the clients. In other words, popular data files are identified by analyzing the access histories. Basically, recent records are more suitable for analyzing the popularity of a file; however, older records are still worthy of reference. In this paper, we set different weights for the records according to their ages in the system, in order to identify files that have recently become popular. Using these weighted records, we devise a method to determine which files are more popular and hence should be replicated. This method is called the Latest Access Largest Weight (LALW) dynamic replication strategy.

The paper is structured as follows. Section 2 gives a brief introduction to previous work on grid data replication. Section 3 describes our dynamic replication strategy in detail, and the performance evaluation is presented in Section 4. Finally, we draw conclusions in Section 5.

2. Related Work

The drawbacks of transferring a file from one site to another in real time are bandwidth consumption and access delay. Data replication is a common method to reduce both by replicating one or more files to other sites. Two kinds of replication methods are possible. Static replication creates and manages replicas manually; in other words, static replication does not adapt to changes in user behavior. Dynamic replication changes the location of replicas and creates new replicas at other sites automatically. Since resources and data files are always changing, dynamic replication is clearly more appropriate for the Data Grid. In this section, we introduce several dynamic replication strategies [9, 10, 11] proposed in the past.

Six replication strategies for three different kinds of access patterns are proposed in [9]. The strategies are No Replication or Caching, Best Client, Cascading Replication, Plain Caching, Caching plus Cascading Replication, and Fast Spread. They are evaluated with three access patterns: (1) a random access pattern with no locality; (2) data with a small amount of temporal locality, that is, some accessed files are likely to be accessed again; and (3) data with a small amount of geographical and temporal locality, where geographical locality indicates that files recently accessed by a client are likely to be accessed by nearby clients.

Caching is different from replication in these studies: replication is performed by the server side and caching by the client side. A server decides when and where to create a replica, either randomly or by recording client behavior; a client requests a file and stores its replica locally for future use. The strategies are briefly introduced as follows. (1) No Replication or Caching: a base case used to contrast with the other strategies. (2) Best Client: the best client is the node that has generated the most requests for the file. A replica of the file is placed at the best client when the number of accesses for the file exceeds a threshold. (3) Cascading Replication: this strategy assumes a tree architecture. The original data files are stored at the root. Once the number of accesses for a file exceeds the root's threshold, a replica is created at the next level down, on the path to the best client. (4) Plain Caching: the client that requests the file stores a replica of the file locally. (5) Caching plus Cascading Replication: this combines Plain Caching and Cascading Replication. (6) Fast Spread: following the path from the root to the client, a replica of the file is stored at each node. The simulation results show that matching the replication strategy to the access pattern saves bandwidth and reduces access latency. To summarize, Fast Spread performs best under the random access pattern and saves the most bandwidth, while Cascading achieves the best response time under the geographical locality pattern.

The multi-tier hierarchical Data Grid architecture supports an efficient method for sharing data, computational, and other resources. In [10], two dynamic replication algorithms were proposed for the multi-tier Data Grid: Simple Bottom-Up (SBU) and Aggregate Bottom-Up (ABU). The basic concept of SBU is to create replicas as close as possible to the clients whose request rates for data files exceed a pre-defined threshold. If the number of requests for a file f exceeds the threshold and f already exists in the parent node of the client with the highest request rate, there is no need to replicate. In contrast, if f does not exist in the parent node and the parent node has enough available space, f is replicated there. The SBU replication algorithm has the disadvantage that it does not consider the relations among the historical records but processes them individually. Therefore, ABU aggregates the historical records tier by tier until the root is reached. The aggregation method adds up the numbers of accesses of records that access the same file and have the same parent node; after aggregation, the parent node id replaces the node id in these records. Except for the aggregation step, all concepts are similar to the SBU replication algorithm.

A centralized dynamic replication mechanism was proposed in [11]. It determines the popular file in the same way as ABU, by analyzing the data access history. In addition, a special idea was proposed: the average number of accesses (NOA) over all records in the history table is computed and acts as the threshold for selecting the popular data files:

\overline{NOA} = \frac{1}{|H|} \sum_{h \in H} NOA(h)

where |H| indicates the number of data files that have been requested, NOA(h) is the NOA value of the h-th record in the history table H, and the NOA field represents the number of accesses for a file. Only files whose NOA exceeds the average will be replicated. After computing the average, the historical records whose NOA values are less than the average are removed. The remaining history records are then sorted in descending order; the first record after sorting corresponds to the most popular file.
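For concreteness, the following is a minimal sketch of this mean-based selection from [11], assuming the history is a simple mapping from file id to access count (the dict layout and names are our illustration, not the actual table format used in [11]):

```python
# Minimal sketch of the mean-NOA selection in [11]; the dict-based
# history layout and function name are illustrative assumptions.

def select_popular_files(history):
    """history: file_id -> NOA (number of accesses)."""
    mean_noa = sum(history.values()) / len(history)   # average NOA over |H| records
    # Keep only files whose NOA exceeds the mean, then sort descending;
    # the first entry is the most popular file.
    survivors = {f: n for f, n in history.items() if n > mean_noa}
    return sorted(survivors.items(), key=lambda kv: kv[1], reverse=True)

print(select_popular_files({"A": 10, "B": 7, "C": 2}))   # [('A', 10), ('B', 7)]
```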

3. LALW Dynamic Replication Algorithm

3.1 The Hierarchical Architecture

In this paper, we propose a hierarchical architecture supporting our dynamic replication mechanism. The design of the architecture is based on centralized data replication management: a Dynamic Replication Policymaker (Policymaker) is responsible for replica management. Figure 1 shows the hierarchical architecture.

Figure 1: The Hierarchical Architecture

We group the grid sites into clusters. In each cluster there is a cluster Header used to manage the site information. The Policymaker collects the information about accessed files from all Headers. Each site maintains a detailed record for each file, stored in the form <FileId, ClusterId, Timestamp>, which indicates that the file FileId has been accessed by a site located in cluster ClusterId at time Timestamp. At regular intervals, each site sends its records to the cluster Header, and all records in the same cluster are aggregated and summarized by the Header. The format of a record in a cluster Header is <FileId, ClusterId, Number>, which indicates that the file FileId has been accessed Number times within cluster ClusterId. After summarizing the records, a cluster Header sends the FileId and Number information to the Policymaker. The Policymaker therefore keeps a table of such records; for example, the two records <A, 10> and <B, 7> indicate that file A and file B have been accessed ten times and seven times, respectively. When the Policymaker has gathered all the information from the different clusters, the popular file is selected according to the numbers of accesses and the weights of the records. Although both a cluster Header and the Policymaker manage access history, the information maintained in a cluster Header is local, whereas global information is maintained in the Policymaker.

There is a special case to handle during aggregation of the records: the numbers of file accesses should be summed up for the same file ID. For example, if a cluster Header holds two records <A, n1> and <A, n2> for the same file A, the result after aggregation is <A, n1+n2>. The result is then sent to the Policymaker. Figure 2 gives an example of how records are processed from a cluster Header to the Policymaker.

Figure 2: An example of record aggregation
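As a sketch of this aggregation step, assuming per-access site records of the form (FileId, ClusterId, Timestamp) as described above (the function and variable names are our illustration):

```python
from collections import Counter

# Sketch of a cluster Header summarizing site records into
# <FileId, ClusterId, Number> tuples for the Policymaker.

def aggregate(site_records, cluster_id):
    """site_records: list of (file_id, cluster_id, timestamp) tuples."""
    counts = Counter(fid for (fid, cid, ts) in site_records if cid == cluster_id)
    # One summary tuple per distinct file id, with the counts summed up.
    return [(fid, cluster_id, n) for fid, n in counts.items()]

records = [("A", "C1", 100), ("A", "C1", 105), ("B", "C1", 110)]
print(aggregate(records, "C1"))   # [('A', 'C1', 2), ('B', 'C1', 1)]
```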

3.2 Latest-Access-Largest-Weight (LALW) Algorithm

At constant time intervals, the Policymaker gathers the file access information from all cluster Headers. Information gathered at different time intervals is given different weights in order to distinguish the importance of history records. The rule for setting weights uses the concept of half-life, which appears in many domains such as physics, chemistry, and medicine. The half-life is the time required for a quantity to decay to half of its initial value, where the quantity may be, for example, a radioactive element. In our algorithm, the weight plays the role of the quantity, and one time interval represents the half-life: the weight of the records in an interval decays to half of its previous value at each new interval. These weights evaluate the importance of the history records; older records have smaller weights, meaning that recent history tables are worthier references than older ones. Figure 3 illustrates the concept. At the first time interval, there is only one table in the Policymaker, T1, whose records have weight 2^0. At the second time interval, the Policymaker holds T1 and T2; because T1 has existed in the Policymaker for one time interval, its weight becomes 2^-1 while the weight of T2 is 2^0. From this rule, at the n-th time interval the weights of T1, T2, ..., Tn are 2^-(n-1), 2^-(n-2), ..., 2^0. These weights are used to find the popular file, as explained in the following.

The Latest-Access-Largest-Weight (LALW) dynamic replication strategy has three phases: first, find which file is the most popular; second, calculate how many replicas need to be created; third, decide where the new replicas should be placed. The LALW algorithm runs at the end of each interval of T seconds, where T is a system parameter.

In the first phase, the Policymaker collects all access records from all cluster Headers. We define an Access Frequency (AF) to capture the importance of access history in different time intervals. Assume N_T is the number of time intervals passed, F is the set of files that have been requested, and a_f^t is the number of accesses for file f in time interval t. The AF for file f is represented as:

AF(f) = \sum_{t=1}^{N_T} \left( a_f^t \times 2^{-(N_T - t)} \right), \quad \forall f \in F \quad (1)

For instance, if a file X has been accessed 5 times in the first time interval and 10 times in the second, then AF(X) = (5 × 2^-1) + (10 × 2^0). AF thus puts different weights on access records from different time intervals. According to Equation (1), we calculate the AFs of all files that have been requested; the file with the largest AF is chosen as the popular file.
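A minimal sketch of Equation (1), assuming each file's access counts are kept as a list indexed by time interval (the names are illustrative):

```python
# Sketch of Equation (1): half-life weighting of per-interval access counts.

def access_frequency(counts_per_interval):
    """counts_per_interval[t-1] = a_f^t for t = 1..N_T; returns AF(f)."""
    n_t = len(counts_per_interval)                   # N_T, intervals passed so far
    return sum(a * 2 ** -(n_t - t)                   # weight halves with each interval of age
               for t, a in enumerate(counts_per_interval, start=1))

# File X from the example: 5 accesses in interval 1, 10 in interval 2.
print(access_frequency([5, 10]))                     # (5 * 2**-1) + (10 * 2**0) = 12.5
```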

In the first phase we have found which file is popular by calculating the access frequency. In the second phase, we compare the average AF per time interval of the popular file (call it p) with that of all files in F. These averages are represented as:

AF_{avg}(p) = \frac{AF(p)}{N_T} \quad (2)

AF_{avg}(f) = \frac{[AF(f)]_{sum}}{N_F \times N_T}, \quad \forall f \in F \quad (3)

where AF(p) is the AF of the popular file p, N_T is the number of time intervals passed, N_F = |F| is the number of different files that have been requested, and [AF(f)]_{sum} is the sum of the AFs of all requested files. We then calculate the number of replicas needed for the popular file in order to achieve load balance. Load balancing tries to distribute traffic efficiently among sites so that no individual site is overloaded. The quotient of AF_{avg}(p) divided by AF_{avg}(f) gives the number of copies that should exist, represented as (4).

Num_{system}(p) = \left\lceil \frac{AF_{avg}(p)}{AF_{avg}(f)} \right\rceil \quad (4)

Let us use Figure 4 as an example to explain the first and second phases. At the first time interval, file A is the popular file and Num_system(p) is 2, which indicates that two copies should exist in the system; since the original already exists, only one new replica of file A needs to be created. At the second time interval, the popular file becomes B and Num_system(p) is again 2, so a replica of file B is created at the second time interval.
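A sketch of Equations (2)-(4), reading the brackets in (4) as a ceiling; the AF values and names below are illustrative:

```python
import math

# Sketch of Equations (2)-(4): number of copies the popular file should have.

def replicas_needed(afs, popular, n_t):
    """afs: file_id -> AF from Equation (1); n_t: intervals passed (N_T)."""
    af_avg_p = afs[popular] / n_t                         # Equation (2)
    af_avg_all = sum(afs.values()) / (len(afs) * n_t)     # Equation (3)
    return math.ceil(af_avg_p / af_avg_all)               # Equation (4)

afs = {"A": 12.5, "B": 6.0, "C": 3.5}
print(replicas_needed(afs, "A", n_t=2))   # ceil(6.25 / 3.67) = 2 copies in total
```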


Figure 3: The half-life decay of record weights over successive time intervals

Figure 4: An example of the LALW replication mechanism, phases 1 and 2

Having found a popular file and estimated the number of replicas needed in phases 1 and 2, the third phase of the LALW dynamic replication algorithm is the placement of the new replicas. The Policymaker requests from every cluster Header the information about the popular file, which contains the ClusterId and Number fields. These records still carry different weights according to their time intervals. After receiving the information from the cluster Headers, the Policymaker calculates AF(p) for each cluster and sorts these records in descending order of AF(p). The number of replicas placed in a cluster is related to the ratio of its AF(p) to the sum of the AF(p)s of all clusters, and the cluster at the head of the sorted list has the highest priority for replica placement. Assume n replicas are needed to achieve system load balance, AF_c(p) is the AF of the popular file p in cluster c, and [AF_c(p)]_{sum} is the sum of AF_c(p) over all clusters. Without loss of generality, assume the sorted cluster sequence is numbered 1, 2, ..., N. Then the number of replicas to be placed at cluster c is calculated by (5):

Num_c(p) = \left\lceil n \times \frac{AF_c(p)}{[AF_c(p)]_{sum}} \right\rceil, \quad c = 1, 2, \ldots, N \quad (5)

The replicas are placed in clusters 1, 2, ..., in turn until all the needed replicas are allocated. Combining Figure 2 and Figure 4, the records for popular file A in the first time interval come from clusters C3 and C2 (the C2 record being <C2, 10>), and only one replica of file A needs to be created; by Equation (5), a replica of file A is created in Cluster 3. The illustration of phase 3 is shown in Figure 5, which extends Figure 4. In this scheme, a cluster may receive one or more replicas; the cluster Header then decides where to put them among the sites it manages. In other words, several replicas of the same file may be located in the same cluster. Before replicating, the Policymaker queries the cluster Header to learn how many replicas of the popular file it already has. If that number is less than the number decided by the Policymaker, the missing replicas are copied to the cluster. Otherwise, the Policymaker does nothing, and extra replicas will eventually be deleted by the Least Frequently Used (LFU) algorithm.
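A sketch of this phase-3 placement, assuming the per-cluster AF values have already been computed (Equation (5); the sample numbers are illustrative):

```python
import math

# Sketch of Equation (5): distribute n new replicas over clusters in
# proportion to each cluster's AF for the popular file, highest AF first.

def place_replicas(af_per_cluster, n):
    """af_per_cluster: cluster_id -> AF_c(p); returns cluster_id -> #replicas."""
    total = sum(af_per_cluster.values())                          # [AF_c(p)]_sum
    ordered = sorted(af_per_cluster.items(), key=lambda kv: kv[1], reverse=True)
    placement, remaining = {}, n
    for cluster, af in ordered:
        if remaining == 0:
            break
        quota = math.ceil(n * af / total)                         # Equation (5)
        placement[cluster] = min(quota, remaining)                # never place more than n overall
        remaining -= placement[cluster]
    return placement

print(place_replicas({"C3": 12.0, "C2": 8.0}, n=1))               # {'C3': 1}
```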

Figure 5: An example of the LALW dynamic replication algorithm

4. Simulations

We use the Grid simulator OptorSim to evaluate the performance of our LALW dynamic replication strategy. OptorSim provides a modular framework to simulate a realistic Data Grid environment; using its basic modules we can compare the effectiveness of different replica optimization algorithms within such an environment [1]. OptorSim is written in Java and was developed by the European DataGrid project [12].

4.1. Elements of Grid Simulation

In order to achieve a realistic simulated environment, a number of elements are included in OptorSim: Computing Elements (CEs), Storage Elements (SEs), the Resource Broker (RB), the Replica Manager (RM), and the Replica Optimiser (RO). Each site consists of zero or more CEs and zero or more SEs. Two configuration files control the inputs to OptorSim. One is the grid configuration file, in which the user specifies the Grid topology and the contents of each site (the numbers of SEs and CEs). The other is the job configuration file, which contains information about the simulated jobs and can also specify which jobs each site will accept. These configuration files are read at the start of the simulation. In addition, there is a parameter file used to set many simulation parameters, for example the scheduling algorithm for the RB, the replication algorithm for the RO, the file access pattern for running jobs, the initial file distribution, and so on [4].

The choices of scheduling algorithm for the Resource Broker are Random, Queue Size, Access Cost, and Queue Access Cost (QAC) [5, 6]. (1) Random scheduling submits jobs randomly to any Computing Element that will run the job. (2) Queue Size scheduling submits jobs to the Computing Element with the fewest jobs waiting in its queue. (3) Access Cost scheduling estimates the current network status and submits jobs to the Computing Element with the smallest cost for accessing all the files required by the job; the estimated cost is in terms of network latencies. (4) Queue Access Cost scheduling takes into account both the access cost for the files and the cost for all jobs in the queue at each Computing Element.

4.2. Simulation Parameters

The topology of our simulated platform is shown in Figure 6. There are four clusters and each one has three sites. Node 8 has the largest capacity in order to hold all the master files at the beginning of the simulation; the other sites have a uniform storage size of 50 GB. All network links have a bandwidth of 100 Mbps. To simplify the requirements, we do not consider the consistency of replicas; data replication strategies commonly assume that data is read-only in Data Grid environments [8].

We ran the simulation with 500 jobs of 6 different job types, each job type requiring specific files for execution. A job is submitted to the Resource Broker every 25 seconds, and the Resource Broker then submits it to a Computing Element according to QAC scheduling. The order of the files accessed by a job is sequential and is set in the job configuration file. There are 150 files in our simulation, each of size 1 GB. To demonstrate the advantages of the dynamic replication algorithm, LALW is compared with the Simple Optimizer and LFU (Least Frequently Used). The Simple Optimizer is a base case in which no replication is performed and files are accessed remotely. The LFU algorithm always replicates, deleting the least frequently accessed files when the SE does not have enough space for replication.

Figure 6: Topology of our simulation

4.3. Simulation Results

The mean job execution time for the various strategies is shown in Figure 7. The mean job execution time using LALW is about 15% shorter than with the Simple Optimizer, while LFU and LALW show similar performance. The number of replicas created in the data grid is larger for LFU than for LALW: LALW replicates only at regular intervals, whereas the LFU algorithm always replicates, which gives LFU an advantage in local hit ratio. Nevertheless, this shows that LALW, though it replicates less, performs similarly to LFU.

Figure 7: Mean job execution time for Queue Access Cost scheduling and replication algorithms

The evaluation metric Effective Network Usage (ENU) in OptorSim is used to estimate the efficiency of network resource usage. It is defined as follows [3, 6]:

ENU = \frac{N_{remote\ file\ accesses} + N_{file\ replications}}{N_{file\ accesses}}

where N_{remote file accesses} is the number of times a CE reads a file from a remote site, N_{file replications} is the total number of file replications that occur, and N_{file accesses} is the number of times a CE reads a file, whether from a remote site or locally. A lower value indicates that the network bandwidth is utilized more efficiently.
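As a concrete reading of the formula, a tiny sketch (the counter names and sample numbers are ours):

```python
# Sketch of the ENU metric: fraction of file accesses that cause
# wide-area traffic (remote reads plus replications); lower is better.

def effective_network_usage(n_remote, n_replications, n_accesses):
    return (n_remote + n_replications) / n_accesses

# E.g. 300 remote reads and 150 replications over 1000 file accesses:
print(effective_network_usage(300, 150, 1000))   # 0.45, i.e. 45% ENU
```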

Figure 8 compares the ENU of the three replication strategies. The Simple Optimizer has 100% ENU because CEs read all files from remote sites. The LALW and LFU algorithms improve the ENU by about 40%-50%. Moreover, the ENU of LALW is about 16% lower than that of the LFU algorithm. The reason is that LFU always replicates, so its large N_{file replications} increases the ENU, whereas LALW regularly pre-replicates files that the next jobs may access to the SEs of suitable sites. LALW therefore surpasses the LFU algorithm in ENU.

Figure 9 illustrates storage resource usage, i.e., the percentage of the available space that is used. Since it depends on the number of replicas, the storage resource usage of the LFU algorithm is higher than that of the Simple Optimizer and LALW. The Simple Optimizer has the smallest value because it always reads the requested files remotely, which increases the job execution time and wastes bandwidth. For the reasons mentioned above, LALW performs well: it saves storage resources and keeps the job execution time low.

Figure 8: Effective Network Usage

Figure 9: Storage Resource Usage

5. Conclusions and Future Work

In this paper, we propose a dynamic replication strategy called Latest Access Largest Weight. At regular intervals, the algorithm collects the data access history, which contains the file name, the number of requests for the file, and the source of each request. These history tables are given different weights according to their ages. By calculating the product of the weight and the number of accesses for a file across the different tables, we obtain a more precise metric for finding a popular file for replication. According to the access frequencies of all requested files, a popular file is found and replicated to suitable sites so as to achieve system load balance. To evaluate the performance of our dynamic replication strategy, we use the Grid simulator OptorSim to simulate a realistic Grid environment. The simulation results show that the average job execution time of LALW is similar to that of the LFU optimizer, but LALW excels in terms of Effective Network Usage.

For future work, we will try to further reduce the job execution time. Two factors need to be considered. One is the length of a time interval: if it is too short, the data access history carries too little information; if it is too long, the information may be outdated and useless. The other is the base of the exponential decay: if the base is larger than 1 but smaller than 2, the weights decay more slowly, and older access history will contribute more to finding the popular files.

Acknowledgements: This research is supported in part by ROC NSC under contract numbers NSC95-2422-H-259-001 and NSC94-2213-E-259-005.

References

[1] W. H. Bell, D. G. Cameron, L. Capozza, P. Millar, K. Stockinger, and F. Zini, "OptorSim - A Grid Simulator for Studying Dynamic Data Replication Strategies," International Journal of High Performance Computing Applications, vol. 17, no. 4, pages 403-416, 2003.

[2] A. Chervenak, I. Foster, C. Kesselman, C. Salisbury, and S. Tuecke, "The Data Grid: Towards an Architecture for the Distributed Management and Analysis of Large Scientific Datasets," Journal of Network and Computer Applications, vol. 23, pages 187-200, 2000.
[3] D. G. Cameron, R. C. Schiaffino, P. Millar, C. Nicholson, K. Stockinger, and F. Zini, "UK Grid Simulation with OptorSim," UK e-Science All-Hands Meeting, Nottingham, UK, September 2003.
[4] D. G. Cameron, R. C. Schiaffino, P. Millar, C. Nicholson, K. Stockinger, and F. Zini, "OptorSim: A Grid Simulator for Replica Optimisation," UK e-Science All Hands Conference, 31 August - 3 September 2004.
[5] D. G. Cameron, R. C. Schiaffino, P. Millar, C. Nicholson, K. Stockinger, and F. Zini, "Evaluating Scheduling and Replica Optimization Strategies in OptorSim," Proceedings of the 4th International Workshop on Grid Computing (Grid2003), Phoenix, USA, November 2003.
[6] D. G. Cameron, R. C. Schiaffino, J. Ferguson, P. Millar, C. Nicholson, K. Stockinger, and F. Zini, "OptorSim v2.0 Installation and User Guide," November 2004. http://edg-wp2.web.cern.ch/edg-wp2/optimization/optorsim.html
[7] I. Foster, "Globus Toolkit Version 4: Software for Service-Oriented Systems," IFIP International Conference on Network and Parallel Computing, Springer-Verlag LNCS 3779, pages 2-13, 2005.
[8] W. Hoschek, F. J. Jaen-Martinez, A. Samar, H. Stockinger, and K. Stockinger, "Data Management in an International Data Grid Project," Proceedings of the First IEEE/ACM International Workshop on Grid Computing (GRID '00), Lecture Notes in Computer Science, vol. 1971, pages 77-90, Bangalore, India, December 2000.
[9] K. Ranganathan and I. Foster, "Identifying Dynamic Replication Strategies for a High-Performance Data Grid," Proceedings of the 2nd International Workshop on Grid Computing (GRID 2001), vol. 2242 of Lecture Notes in Computer Science, pages 75-86, Denver, USA, November 2001.
[10] M. Tang, B.-S. Lee, C.-K. Yeo, and X. Tang, "Dynamic Replication Algorithms for the Multi-Tier Data Grid," Future Generation Computer Systems, vol. 21, pages 775-790, May 2005.
[11] M. Tang, B.-S. Lee, X. Tang, and C.-K. Yeo, "The Impact of Data Replication on Job Scheduling Performance in the Data Grid," Future Generation Computer Systems, vol. 22, pages 254-268, February 2006.
[12] The European DataGrid Project, http://eu-datagrid.web.cern.ch/eu-datagrid/
