J Grid Computing (2010) 8:109–131 DOI 10.1007/s10723-009-9120-9

An Adaptive Execution Scheme for Achieving Guaranteed Performance in Computational Grids

Ajanta De Sarkar · Sarbani Roy · Dibyajyoti Ghosh · Rupam Mukhopadhyay · Nandini Mukherjee

Received: 7 November 2008 / Accepted: 27 April 2009 / Published online: 28 May 2009 © Springer Science + Business Media B.V. 2009

Abstract The nature of the resource pool in a Grid environment is heterogeneous and dynamic. The availability, load and status of the resources may change during the execution of an application. Therefore, in order to maintain the performance guarantee (as agreed upon through service level agreements (SLAs) between the client and the resource providers), an application may need to adapt to its run-time environment on the basis of resource availability and application demands. Often the application components must be migrated to a new set of resources during execution so that the performance guarantee can be maintained. The objective of this paper is to present an adaptive execution scheme for achieving guaranteed performance on the basis of the SLAs. The scheme has been implemented based on the notion of performance properties and by deploying a set of autonomous agents within an integrated performance-based resource management framework.

Keywords Adaptive execution · Performance properties · Local tuning · Migration · Grid

A. De Sarkar (B), Department of Computer Science and Engineering, Birla Institute of Technology, Mesra, Kolkata Campus, Kolkata 700 107, India, e-mail: [email protected]

S. Roy · D. Ghosh · R. Mukhopadhyay · N. Mukherjee, Department of Computer Science and Engineering, Jadavpur University, Kolkata 700 032, India, e-mails: [email protected], [email protected], [email protected], [email protected]

1 Introduction

In a computational Grid environment, the status and load of the resources change frequently and the user (or the client of the computational services) has no control over them. Therefore, maintaining a Quality of Service (QoS) guarantee for the submitted jobs is a major challenge. A resource manager is deployed and assigned the responsibility of mapping the jobs onto appropriate resources so that the QoS level desired by the client can be attained. However, in a dynamic and complex environment like the Grid, this is a nontrivial task. Resource management in a Grid, therefore, must be supported by regular performance analysis of the jobs. Often performance models are used to estimate the capabilities of the resource set and computational jobs are mapped
onto the resource set based on this model. But such models generally fail because of the frequently changing resource loads and resource usage scenarios and the volatility of the resource pool (resources may join and leave frequently). The initial mapping of the jobs onto a set of selected resources may therefore need to be reviewed and changed during run-time. Jobs may be tuned to adapt to certain changes in the environment or may be migrated to other resources, thereby addressing any kind of performance problem during run-time.

This paper presents an integrated framework for performance analysis and resource management. The framework supports adaptive execution of components (hereafter referred to as jobs) of an application, and thus endeavors to maintain its guaranteed performance as agreed upon through service level agreements (SLAs). It also supports adaptive execution of a batch of independent jobs. A multi-agent system (MAS) is deployed within the framework that engages a set of autonomous agents to accomplish various tasks. These agents have different sub-goals and work independently while cooperating with each other to achieve these goals. Three major tasks are accomplished by these agents: resource brokering, performance monitoring and analysis, and adaptive execution of jobs. The next section gives an overview of this multi-agent system.

The primary focus of this paper is on performance monitoring and analysis and adaptive execution of jobs within the MAS. A number of Analysis agents are organized in a hierarchy and are employed to carry out performance analysis at various levels of the environment. An overview of the hierarchical Analysis agent framework is presented in Section 3. Performance monitoring in this work is based on the concept of performance properties. Performance properties characterize specific performance behaviours of a program. Every performance property can be checked by a set of conditions and is associated with a severity figure. Severity indicates whether a performance problem has occurred during the execution of a job. Section 4 discusses the performance properties and their relevance in the current work and Section 5 presents the
notion of severity along with its computation techniques. Section 6 discusses the adaptive execution scheme and describes the algorithms for maintaining quality of service and agent interactions. Implementation of the scheme is discussed in Section 7. A number of experiments have been carried out to demonstrate the adaptive execution scheme and its effectiveness; details of the experimental setup and the results of the experiments are also discussed in Section 7. Related work is discussed in Section 8. Section 9 concludes with a direction for future work.

2 Overview of the Multi-Agent System

Within a MAS, multiple sub-organizations may coexist and perform their tasks autonomously [7]. Such sub-organizations can be found when there are segments of the overall system that fulfill any of these conditions: (a) exhibit a behavior specifically oriented towards the achievement of a given sub-goal, (b) interact loosely with other segments of the system, or (c) require competences that are not needed in other parts of the system. In order to identify sub-organizations within a MAS, the goals and sub-goals of the system must be recognized first. For the MAS presented in this paper, four major sub-goals are identified:

- resource brokering,
- controlling the execution of a batch of jobs,
- performance analysis of the infrastructure and individual jobs, and
- performance improvement of the jobs whose performances have degraded.

Therefore, four sub-organizations may be identified, as shown in Fig. 1. The responsibilities of these four sub-organizations are (1) finding suitable resources for each job as per their resource requirements, (2) controlling the concurrent execution of jobs while keeping the resource utilization cost for the entire batch as low as possible, (3) performance monitoring of the jobs and the infrastructure, and (4) improving performance at
Fig. 1 Sub-organizations

run-time in case of any performance problems. The tasks are again subdivided as given below:

1. In the Resource Broker component, the tasks are (i) resource brokering and (ii) resource management. Resource management is done on behalf of the resource providers.
2. In the Job Controller component, (i) a supervisory task oversees the execution of all jobs in a batch, and (ii) multiple subordinate tasks control the execution of individual jobs. SLA negotiation is also a task of the Job Controller component.
3. In the Analyzer component, performance analysis is done at different levels (explained later).
4. In the Performance Tuner component, actions are taken to improve the performance of individual jobs. The actions may involve performance tuning at the local level or rescheduling.

Altogether, six types of agents are identified for carrying out these tasks. These are: (i) Broker Agent (BA), (ii) ResourceProvider Agent (RPA), (iii) JobController Agent (JCA), (iv) JobExecutionManager Agent (JEM Agent), (v) Analysis Agent, and (vi) Tuning Agent (TA). Analysis Agents are further subdivided into different categories, as explained in the next section. Activities of the other agents are summarised below.

A Broker Agent is responsible for finding suitable resources for a batch of jobs submitted to the system. The Broker Agent prepares a Job Requirement List (JRL) from the description of each job. On the other hand, the ResourceProvider Agent on a particular resource provider prepares a ResourceSpecificationMemo specifying the resources that
are available for the clients. The ResourceSpecificationMemo is stored in a ResourceSpecificationTable (RST), which is later used by the Broker Agent for resource brokering. The JRL is matched with the entries in the RST and the match response, along with a ResourceProviderList, is sent to the JobController Agent. A ResourceProviderList is a collection of all the resource providers who can meet the requirements of a particular job. The collection of the ResourceProviderLists for all the concurrent jobs in a batch forms a JobResourceMatrix. The ResourceProviderList for job Ja becomes a row in the JobResourceMatrix, and the i-th cell of this row is 1 if RPi can satisfy the requirements of Ja and 0 otherwise. Once the JobResourceMatrix is ready, the JobController Agent decides an optimal mapping of all jobs in the batch onto the Grid resource providers and accordingly creates a JobMap. This agent is also responsible for establishing Service Level Agreements (SLAs) between the client and the resource owners.

While the JobController Agent creates and maintains the JobMap, performs initial scheduling of jobs and later keeps track of the execution of all these jobs, a JobExecutionManager Agent becomes associated with each job and controls and keeps track of its execution. When a job is submitted to a particular resource provider, the agent moves there along with the job, liaises with the Analysis and Tuning Agents deployed on the resource provider and performs rescheduling (migration) of the job after being instructed by the Tuning Agent. If the performance of a job degrades (as identified by the Analysis Agents), appropriate actions are decided by the Tuning Agent. Actions may involve improvement in the local runtime environment or migration to a different resource provider. Some of these actions are discussed in Section 6.
In case a job needs to be migrated to a different resource provider, the JobExecutionManager Agent establishes an SLA with the newly selected resource provider and moves along with the job and resumes execution on the resource provider. Detailed discussion on the design and implementation of a major part of the MAS can be found in [28]. All the agents mentioned in this section collectively implement the adaptive execution scheme which is presented in this paper.
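The brokering structures described in this section (JRL, RST, JobResourceMatrix) can be sketched as follows. The simple "capability at least requirement" matching rule and all names and numbers below are our illustrative assumptions, not the paper's actual algorithm:

```python
# Hypothetical sketch of the brokering data structures described above.
# jrls maps each job to its Job Requirement List; rst is the
# ResourceSpecificationTable mapping resource providers to capabilities.

def build_job_resource_matrix(jrls, rst):
    """Return {job_id: {rp_id: 1 or 0}} - one row per job, where a cell
    is 1 if that resource provider can satisfy the job's requirements."""
    matrix = {}
    for job, req in jrls.items():
        matrix[job] = {
            rp: int(all(cap.get(k, 0) >= v for k, v in req.items()))
            for rp, cap in rst.items()
        }
    return matrix

jrls = {"Ja": {"cpus": 4, "mem_gb": 8}, "Jb": {"cpus": 16, "mem_gb": 32}}
rst = {"RP1": {"cpus": 8, "mem_gb": 16}, "RP2": {"cpus": 32, "mem_gb": 64}}
matrix = build_job_resource_matrix(jrls, rst)
# Row for Ja: both providers satisfy it; row for Jb: only RP2 does.
```

The JobController Agent would then choose one provider per row of this matrix to form the JobMap.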


Fig. 2 The global grid model [24]. (The figure shows multiple Grid sites, Sites A, B and C, each exposing SMPs and clusters through a Grid middleware UI/API layer.)

3 Hierarchical Organization of Analysis Agents


Charles Kesler [24] describes an Enterprise Grid Model with heterogeneous resources, such as clusters, SMPs, and workstations of dissimilar configurations, but all are tied together through a Grid middleware layer. The resources may not be under a single administrative domain, are loosely coupled and connected via 100 or 1,000 Mbps Ethernet. A single resource registry and a grid security service are offered by the middleware layer. On the other hand, a Global Grid is described as a collection of enterprise Grids, which are loosely coupled between sites. There is not much control over QoS; resources belong to mutually distrustful administrative domains and usually have multiple grid resource registries and grid security services. The organization of a Global Grid is shown in Fig. 2.

Evidently, such a system must be monitored at different levels, and therefore the Analysis Agents of the MAS should be deployed accordingly. In this paper, a hierarchical organization of the Analysis Agents is proposed, where agents are divided into the following four logical levels of deployment in descending order: (1) Grid Agent (GA), (2) Grid Site Agent (GSA), (3) Resource Agent (RA) and (4) Node Agent (NA). At the four levels (as shown in Fig. 3) of the agent hierarchy, each agent has some specific responsibilities, as discussed below.

At the first level of the hierarchy, individual workstations, SMP nodes and each individual node in the clusters are considered as providers of

Fig. 3 Hierarchical organization of analysis agents. (GA: Grid Agent, GSA: GridSite Agent, RA: Resource Agent, NA: Node Agent. The figure shows RAs above the SMPs and clusters, with NAs on the individual nodes.)


computational resources (resources include CPU, memory, etc.). Node Agents (NAs) work on these computational resources and their responsibility is to analyze the performance of the resources and of the jobs running on them.

At the next level, a Resource Agent (RA) is deployed, which analyses the overall performance of the nodes in a cluster. Thus, when a collection of cluster nodes (each node can be an SMP, a workstation or a desktop PC) acts as a resource provider, the RA is responsible for looking after performance at the cluster level (such as cluster-level load balancing). The cluster configuration can be homogeneous or heterogeneous.

At a higher level, a Grid Site Agent (GSA) is deployed. This agent is responsible for all the Grid resources located at a particular geographical site. The resource providers at a particular Grid site may be under a single administrative domain and may have different policies regarding resource availability and usage. The GSA must be able to interact with all these resource providers. A GSA is also useful for identifying faults in any of the resources and for checking whether any of the resources is overloaded.

Finally, at the highest level, a Grid Agent (GA) is associated with the global Grid and looks after the health of the entire Grid. There is only a single GA, which liaises with all the GSAs and analyzes the overall performance of the resources at each Grid site. This analysis is based on the data collected from the GSAs. Thus, each GSA must regularly update the GA to enable it to carry out the analysis.
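The four-level reporting structure described in this section can be illustrated with a minimal sketch. This is not the paper's implementation; the class, agent names and roll-up rule are assumptions made purely to show how monitoring data flows from NAs up to the single GA:

```python
# Illustrative sketch of the NA -> RA -> GSA -> GA analysis hierarchy.

class Agent:
    def __init__(self, name, children=()):
        self.name = name
        self.children = list(children)

    def collect(self):
        """Aggregate monitoring reports from this level and all below."""
        reports = [self.name]
        for child in self.children:
            reports.extend(child.collect())
        return reports

na1, na2 = Agent("NA:node1"), Agent("NA:node2")
ra = Agent("RA:cluster", [na1, na2])
gsa = Agent("GSA:siteA", [ra])
ga = Agent("GA", [gsa])

# The single GA sees data rolled up from every level of the hierarchy.
print(ga.collect())  # ['GA', 'GSA:siteA', 'RA:cluster', 'NA:node1', 'NA:node2']
```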

4 Performance Properties

Performance monitoring and analysis in the MAS depend on the notion of performance properties. Performance properties were introduced in [12] and are thoroughly discussed in [13]. A performance property characterizes a specific performance behaviour of a program, such as load imbalance, cache misses or communication. Every performance property can be checked by a set of conditions and is associated with a severity figure which shows how important the performance property is in the context of the performance of
an application. When the severity figure crosses a predefined threshold value, the property becomes a performance problem, and the performance property with the highest severity value is considered to be a performance bottleneck. Performance properties for OpenMP and MPI programs have been defined in detail in [12, 14, 26]. The properties defined by the Julich Supercomputing Centre [26] attempt to capture the program behaviour in finer detail. However, the objective of the current work is to focus on a framework for adaptive execution based on the identification of certain performance properties that need to be handled in order to improve the execution performance of a program. Thus, the current work considers only some of the properties which are useful for the above-mentioned purpose. In particular, in this paper we concentrate only on OpenMP programs and demonstrate our work with two performance properties, namely InadequateResources and LoadImbalance.

The InadequateResources property identifies the problem of executing a portion of a job sequentially (or on fewer processors) when additional processors are required in order to maintain the quality of service. A property called Non-parallelized code has been used in [13] that captures the overhead due to the sequential part of the program. Furlinger et al. [14] define UnparallelizedInSingleRegion and UnparallelizedInMasterRegion to capture the situation when time is lost due to a single thread executing a construct. However, the InadequateResources property defined in this paper not only captures the problem due to the sequential part of a job, but also detects performance problems when a job runs in parallel on fewer processors than it requires in order to maintain the guaranteed performance. The LoadImbalance property specified in this paper addresses the issue of load imbalance in parallel loops, and is thus similar to the ImbalanceInParallelLoop property defined in [14].

Another novelty of our work is that severities for these properties are computed only when the significant parts of the jobs (defined in the SLA, which will be discussed in the next section) are executed. The system makes no attempt to measure the severity in other parts of the job, thereby reducing the overhead due to collection
of monitoring data and necessary computation. There are many other properties, which need to be considered for effective tuning of performance of jobs. All these properties and appropriate runtime tuning actions for these properties will be dealt with in our future work. MPI programs will also be considered in the future work. Next section discusses how the QoS is handled within the MAS and presents severity computation techniques in order to identify the performance bottlenecks.

5 QoS and Severity Computation

In a computational Grid environment, quality of service (QoS) is primarily defined in terms of the execution performance of an application or its components. A client submits the QoS requirements and these are guaranteed by a resource provider through the establishment of an SLA. SLA establishment follows the process proposed in [9]. At the time of initial allocation of a job onto a resource provider, the client first sends a Task Service Level Agreement (TSLA) (the JRL in the current work) to the resource broker component. A TSLA has two parts; the first part specifies the requirements for a job (such as preferred start-time, end-time, expected completion time, number of processors) and the second part specifies information about every significant region (loop) in the program (such as nesting level, loop bounds) in the form of job metadata. A Resource Service Level Agreement (RSLA) (here the ResourceSpecificationMemo) containing information about the capabilities of a resource provider is sent by the resource provider to the resource broker component. When a specific resource provider is selected for a job [29], the SLA is finalized in the form of a Binding Service Level Agreement (BSLA), which is a tripartite agreement between the client, the resource provider and the brokering component of the system. The BSLA (referred to simply as the SLA) contains information about the requirements of the job and the availability of resources as committed by the resource provider.

In the current work, the client is expected to mention the expected completion time (Tect) (this is basically the estimated execution time) for a job in the JRL, and consequently this is included in the SLA. This information is generally based on some pre-execution analysis of the job or on historical data gathered from any previous execution of it. For every job, a JobMetaData is prepared and submitted by the client that contains information about its significant regions. At the time of establishing the SLA, this becomes part of the SLA. Severities of the performance properties are in some cases computed using the information stored in the JobMetaData. In this work, only loops are considered as significant regions and severity values are computed during the execution of these significant regions only.

Whenever a significant region starts executing, the NA introduces certain measurement points within the region. A factor f (currently we are only considering loops, therefore f is simply a fraction of the total number of iterations of the loop) is communicated to the JEM Agent, which controls the execution of the job. The factor f indicates the distance between two measurement points (initially, it is the distance of the first measurement point from the start of the significant region). At each point, performance data related to the execution of the job are collected and sent to the NA for computation of severity values. Occasionally, the NA may revise the factor f depending on the requirements. Severity computations for the two performance properties used in our work are discussed below.

5.1 Severity Computation for InadequateResources Property

Severity computation for the InadequateResources property is explained in this section by first considering a job with a single significant loop and a single measurement point. Let us consider that the job starts executing serially on a particular resource provider in the Grid. The following notations are used in this section:

- f is the distance of a measurement point from the beginning of the significant loop of the job, i.e., the f portion of the loop is executed before performance data are collected.
- Tect is the expected completion time of the job (as defined by the client).
- Tfact is the actual completion time of the job up to a measurement point (the f portion).

Therefore, the time remaining for execution of the rest of the job (the 1 − f portion) is (Tect − Tfact). The estimated completion time for the remaining part of the job (the 1 − f portion) is Tfact × (1 − f)/f. Now, considering that the job runs on multiple processors in ideal time (no overhead is present), the remaining portion (1 − f) of the job will finish within the expected completion time if p processors are allocated for it, where

p ≥ [Tfact × (1 − f)] / [(Tect − Tfact) × f]    (1)

[In the ideal condition, Tfact + [Tfact × (1 − f)/f] / p ≤ Tect, i.e., [Tfact × (1 − f)/f] / p ≤ (Tect − Tfact).]

Hence, p should be at least the ratio of the estimated completion time for the remaining part and the actual remaining time. Therefore, the severity for the InadequateResources property is defined as the ratio of the estimated completion time and the actual remaining time; if it is greater than one, additional resources (here, processors) are required to complete the job in time. If it is less than one, the job is expected to finish in time even with a single processor.

However, in a dynamic environment, the performance of a job must be monitored on a regular basis. Therefore, multiple measurement points should be inserted in the significant region. Let us consider a case where the job is suspended after execution of the f1, f2, f3, ..., fi portions of a loop and performance data are collected (i.e., f1, f2, f3, ..., fi are the measurement points for the loop). Let t1, t2, t3, ..., ti be the execution times of each portion (as per the data collected after each measurement point). Thus, the portion of the job remaining to be executed is:

1 − (f1 + f2 + f3 + ... + fi) = 1 − Σ_{k=1..i} fk    (2)

The actual completion time of the job up to the last measurement point (the fi portion) is given by the following equation:

Tfact = t1 + t2 + t3 + ... + ti = Σ_{k=1..i} tk    (3)

The estimated time to complete the entire job (if no change occurs after the last measurement) is given by (4):

Tfact + (1 − Σ_{k=1..i} fk) × (ti / fi) = Σ_{k=1..i} tk + (1 − Σ_{k=1..i} fk) × (ti / fi)    (4)

In the above estimation, it is assumed that the job and the environment may be tuned at each interval, but the last estimation is based on the current environment. Again considering that the job runs on multiple processors in ideal time, the remaining portion of the job will finish within the expected completion time, Tect, if p processors are allocated for its execution. Here,

p ≥ [(1 − Σ_{k=1..i} fk) × (ti / fi)] / [Tect − Σ_{k=1..i} tk]    (5)

[Because Σ_{k=1..i} tk + [(1 − Σ_{k=1..i} fk) × (ti / fi)] / p ≤ Tect.]

Hence, here also the condition to be satisfied is that p should be greater than or equal to the ratio of the estimated completion time and the actual remaining time, and, as before, severity is defined as this ratio. Thus,

sev_resr(Li) = Tfact^f′(Li) / Tect^f′(Li)    (6)

where f′ is the remaining portion of the significant region, Tfact^f′(Li) is the estimated time to execute this remaining portion and Tect^f′(Li) is the time still available within the expected completion time. In case there are multiple significant regions in a job, the expected completion time for each region is computed as Tect(Li) = Tect × Lifrac, where Lifrac is the proportionate execution time of the loop Li compared to the total execution time of the job. Values corresponding to each Lifrac are obtained from historical data and stored in the JobMetaData for future use. Severity of
the InadequateResources property for each significant region is then computed by following the above procedure.

If the job initially starts executing on multiple processors instead of a single processor, the above computation may be slightly modified for the estimation of the remaining execution time at any particular measurement point. If the job initially starts executing on q processors, the severity of the InadequateResources property can be computed as:

sev_resr(Li) = Tfact^f′(Li) / (q × Tect^f′(Li))    (7)
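As a concrete illustration of inequality (1) and the severity definitions in (6) and (7), the following sketch computes the minimum processor count and the InadequateResources severity for made-up timing values. The function names and the numbers are ours, chosen purely for illustration:

```python
import math

def min_processors(t_fact, t_ect, f):
    """Inequality (1): smallest p so that the remaining (1 - f) portion
    finishes by Tect, assuming ideal (no-overhead) parallel speedup."""
    return math.ceil((t_fact * (1 - f)) / ((t_ect - t_fact) * f))

def sev_resr(t_fact_remaining, t_ect_remaining, q=1):
    """Severity (6)-(7): estimated time for the remaining portion f'
    over the time still allowed, scaled by the q processors in use."""
    return t_fact_remaining / (q * t_ect_remaining)

# Half the loop took 60 s of a 100 s budget: 60 s of serial work remains
# but only 40 s of wall-clock budget is left, so 2 processors are needed.
print(min_processors(t_fact=60.0, t_ect=100.0, f=0.5))   # 2

# The same situation expressed as a severity figure:
print(sev_resr(60.0, 40.0, q=1))   # 1.5 -> performance problem detected
print(sev_resr(60.0, 40.0, q=2))   # 0.75 -> 2 processors suffice
```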

5.2 Severity Computation for LoadImbalance Property

Severity for the LoadImbalance property is measured following the proposal of Bull et al. [5]. In the case of the LoadImbalance property, execution times for every thread are measured. Thus, the severity figure for a loop Li is given as follows:

sev_load(Li) = [(Tmax^f(Li) − Tavg^f(Li)) / Tmax^f(Li)] × 100    (8)

where Tmax^f(Li) is the maximum time spent in a thread to execute the f portion of the loop and Tavg^f(Li) is the average taken over the times spent in all threads to execute the f portion of the loop.

In both cases (and in the case of other performance properties as well), when the above severity values are greater than a threshold, the corresponding performance problem is detected. Such problems indicate that the QoS cannot be maintained, or in other words, guaranteed performance cannot be achieved. Therefore, the agents involved with the analysis process take some action so that the performance problem can be overcome. Section 6.2 describes the operations and interactions of the agents in order to explain how QoS is maintained in the system. However, before presenting the algorithms, an adaptive execution scheme for achieving guaranteed performance of a batch of jobs is briefed in Section 6.1.

6 Adaptive Execution

In a Grid environment, resources are owned by different resource owners and are administered under multiple administrative domains. Status and load of the resources change frequently, and jobs need to adapt to the changes so that the desired performance can be achieved. An adaptive execution scheme is presented in this section and is implemented within our multi-agent framework. Section 6.1 provides an overview of the scheme and Section 6.2 presents the interactions among the agents for maintaining the QoS.

6.1 Overview

Initial Allocation  Whenever a client submits a batch of jobs (components of one or more applications), the client also specifies resource requirements for each of them. On the basis of the job requirements and resource capabilities, a resource selection algorithm is run [29] to select an initial set of resources for allocation of the jobs. For each job, an SLA is established between the selected resource provider and the client. As mentioned, the SLA contains the expected completion time (Tect), which provides the basis of the performance guarantee for the job. Consequently, the next objective of the multi-agent framework is to maintain this performance guarantee by adaptively executing the job.

In general, a job may adapt to its current runtime environment depending on its own resource requirements and the present resource usage scenario. Additionally, adaptive execution may also be provided by automatic migration of the job in any of the following cases: (i) performance degradation, (ii) changes in job requirements, (iii) resource failure, (iv) any kind of fault in the execution environment. The MAS described in this paper enables adaptive execution either by attempting to adapt to the current environment on the same resource provider or by migrating the job to a different resource provider. When a performance problem is identified (i.e., a severity figure is above a threshold value), the system first applies local tuning techniques, i.e. attempts to improve the underlying execution environment. However, when the resource requirements cannot be fulfilled on the
current resource provider, the job is migrated to another appropriate resource provider. Two important functionalities are necessary for enabling the adaptive execution scheme: one is run-time performance analysis and the other is deciding the course of actions for performance tuning of jobs.

Performance Analysis  Following SLA establishment, a job is submitted to the selected resource provider and a JobExecutionManager Agent (JEM Agent), which is a mobile agent, becomes associated with it and is deployed on the same host along with the job. A Node-level Analysis Agent (NA) [10, 11] sitting on the resource provider monitors the execution of all jobs submitted to that resource provider (possibly by different clients) and analyses their performance. Performance analysis is done at periodic intervals (which may be different for different jobs running on the same resource provider), as discussed in the earlier section. The NA instructs the JEM Agent to suspend the job after executing a certain portion of it (after reaching a measurement point), gathers all relevant performance data at this point and analyses these data. In case the NA identifies a performance problem and predicts that the resource provider may fail to maintain the performance guarantee, it invokes a Tuning Agent (TA).

Tuning Actions  The TA decides either to locally tune the job or to migrate the job to a different resource provider. A decision regarding the next course of action (local tuning or migration) is taken on the basis of the nature of the performance problem and the availability and current status of the resources. Thus, the job may execute on more processors if sufficient resources are available on the same resource provider, or may be tuned to reduce load imbalance. Other performance problems may also be tackled accordingly. In certain scenarios, the TA may decide to migrate the job.
This may happen primarily in the following two cases: (i) if resources on the same resource provider are overloaded, and (ii) if a higher-capability resource provider is required in order to achieve the guaranteed performance. Surely, there may be other scenarios when job migration is necessary (like resource failure), although they are not being considered in the current work. At the time of job migration, the JEM Agent participates in resource brokering; with the assistance of the GSA it selects a new suitable resource provider [29] and reschedules the job to this newly selected resource provider. The JEM Agent, being a mobile agent, moves along with the job and resumes its execution from the point at which it has been suspended.

SLA Management  Every time a job is allotted or migrated to a new resource provider, an SLA is established between the client and the resource provider. After the initial allocation, the JEM Agent carries the SLA with it. At the time of migration, a new resource provider is selected (following the steps described in Section 6.2.3) and the JEM Agent sends the SLA to it for acceptance. If accepted, the JEM Agent, along with the SLA and the associated job, moves to the resource provider. However, even after several tuning and migration actions, there are possibilities that the SLA is violated. In such cases, we propose that the current execution of the job is completed and the entire execution history of the job is reported to the GA. This information may be stored and used in future for allocation and execution of similar types of jobs. We are currently working with job modeling techniques to address this issue.

The scheme described in this section has been implemented as a tool, called PRAGMA. Some implementation details and experiments using the tool are discussed in Section 7. Section 7 also highlights the observation that the costs involved with monitoring, analysis and migration are negligible.

6.2 Agent Interaction for QoS Maintenance

In this section, we focus on the interactions among agents involved with the following three services: (i) run-time performance analysis of jobs, (ii) run-time local tuning of jobs, and (iii) run-time job migration to other resources.
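The tuning-versus-migration policy described in Section 6.1 can be sketched as a simple decision rule. The threshold value, the ceiling rule and the return labels below are illustrative assumptions, not the TA's actual logic:

```python
import math

# Illustrative Tuning Agent decision rule: try local tuning first,
# migrate only when the local provider cannot supply the resources.

SEVERITY_THRESHOLD = 1.0  # assumed threshold; the paper leaves it open

def tuning_action(severity, processors_free):
    if severity <= SEVERITY_THRESHOLD:
        return "resume"                    # no performance problem
    needed = math.ceil(severity)           # processors implied by severity
    if needed <= processors_free:
        return f"local_tune:add_{needed}_processors"
    return "migrate"                       # reschedule via the JEM Agent

print(tuning_action(0.8, 2))   # resume
print(tuning_action(2.5, 4))   # local_tune:add_3_processors
print(tuning_action(2.5, 1))   # migrate
```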
6.2.1 Performance Analysis

NAs are always active on each resource provider and are responsible for run-time performance analysis of all the jobs running on the local resources. When execution of a job starts, the JEM Agent associated with it sends a 'PRAGMA_REGIS' message to the NA. The NA, in response, sends a 'PRAGMA_DATA' message back to the JEM Agent indicating the fraction of each significant region to be executed before collection of any performance data (f1, f2, f3, ..., fi, etc.). At each measurement point, the JEM Agent collects performance data, sends the data to the NA using a 'PRAGMA_EXECUTION' message, and suspends the job. The NA computes the severities of different performance properties (such as sev_resr(fLi) and sev_load(fLi)) for the particular significant region. If one or more properties are detected as severe, the NA invokes the TA and sends a 'PRAGMA_WARNING' message to the JEM Agent. In case none of the measured properties is severe, the NA sends a 'PRAGMA_RESUME' message to the JEM Agent for resumption of the job. The 'PRAGMA_RESUME' message also carries a revised value of f, which indicates the next portion of the significant region to be executed before arriving at a measurement point. The NA continues this iterative performance analysis till the end of the job. The algorithms for the NA and the JEM Agent are presented in Figs. 4 and 5 respectively. The algorithm for the JEM Agent also shows the steps required for migration of jobs; these steps are explained in Section 6.2.3. In the algorithms, only the messages communicated between the agents and the methods called by them are highlighted; details are not included here.

Fig. 4 Algorithm for Node Agent (NA) [figure omitted]
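The iterative NA–JEM exchange described above can be sketched as a simple decision rule. This is an illustrative sketch only; the message names come from the paper, but the class, method names and the example fraction schedule are our assumptions.

```java
// Sketch of the NA <-> JEM Agent measurement loop (names other than the
// PRAGMA_* messages are illustrative assumptions).
import java.util.List;

public class MeasurementLoop {
    enum Msg { PRAGMA_REGIS, PRAGMA_DATA, PRAGMA_EXECUTION,
               PRAGMA_WARNING, PRAGMA_RESUME, PRAGMA_RESCHEDULE }

    /** NA side: decide the reply to one PRAGMA_EXECUTION report.
     *  A property is 'severe' when its severity exceeds the threshold. */
    static Msg analyse(double sevResr, double sevLoad, double threshold) {
        if (sevResr > threshold || sevLoad > threshold) return Msg.PRAGMA_WARNING;
        return Msg.PRAGMA_RESUME; // carries the next fraction f as payload
    }

    /** Example fractions f1, f2, ... (percent of a significant region)
     *  between measurement points, as used in the later experiments. */
    static List<Integer> fractionSchedule() {
        return List.of(5, 10, 20, 40, 25); // 25 = the remainder
    }

    public static void main(String[] args) {
        System.out.println(analyse(6.0, 0.0, 1.0));  // severe: invoke TA, warn
        System.out.println(analyse(1.0, 0.26, 1.0)); // not severe: resume job
    }
}
```

With the threshold of 1 used later in the experiments, a report with sev_resr = 6 yields PRAGMA_WARNING, while sev_resr = 1 and sev_load = 0.26 yields PRAGMA_RESUME.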

Fig. 5 Algorithm for JobExecutionManager Agent (JEM Agent) [figure omitted]

6.2.2 Local Tuning

Like NAs, TAs are always active on each local resource provider in the Grid. A TA is invoked by the NA and receives an appropriate message from it. The message indicates the performance problem and contains a job_id and the severity value of the related performance property. If the InadequateResources property becomes severe, the TA first checks resource availability on the local resource provider. If the severity value of the InadequateResources property is s, then, following the definition of severity, the number of processors is increased to s. If the required number of processors is available on the same resource provider, a 'PRAGMA_RESUME' message indicating the required number of processors is sent to the JEM Agent. In order to increase the number of processors, the number of threads in the parallel region is increased (by increasing the number of OpenMP threads); we assume that these processors are not shared by any other job. If the processors are not available (or busy), a 'PRAGMA_RESCHEDULE' message is sent to the JEM Agent, so that a migration decision can be taken.

In the case of LoadImbalance, the local scheduling strategy is changed. In the current work, OpenMP scheduling strategies are used to reduce the effect of load imbalance (for example, static scheduling may be changed to dynamic scheduling); in future implementations, more sophisticated mechanisms will be incorporated. The TA instructs the JEM Agent to change the strategy, and the JEM Agent applies the change to the suspended job and executes the remaining part of the job with the new strategy. The scheduling strategy recommended by the TA is passed to the JEM Agent as a message using the ACL (Agent Communication Language) standard. The JEM Agent extracts the scheduling strategy from the received ACL message and passes it to the job as a parameter, which is set as the new OpenMP scheduling strategy for executing the next portion of the job. In order to pick an appropriate strategy, the TA takes into account the shape of the loop (triangular, square, irregular, etc.). The shape of the loop is either obtained from the JobMetaData (known through static analysis) or decided at the time of execution. A knowledge base is used to obtain the appropriate scheduling strategy for a particular loop shape on a particular platform [17]. The algorithm for the TA is presented in Fig. 6, and the algorithms for the addTuningAction() methods of the two properties are presented in Figs. 7 and 8. Currently, the TA uses a simple algorithm because it handles only two performance properties; in a more complicated scenario, a better algorithm and proper use of a knowledge base should be introduced.

Fig. 6 Algorithm for Tuning Agent (TA) [figure omitted]
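The two tuning rules above can be condensed into a short decision sketch. All names here are illustrative assumptions, not the tool's actual API; only the PRAGMA_* message names and the severity semantics come from the paper.

```java
// Illustrative sketch of the Tuning Agent's decision rules: the severity s of
// InadequateResources is the number of processors (OpenMP threads) to request,
// and a severe LoadImbalance switches the scheduling strategy.
public class TuningAgentSketch {
    /** InadequateResources: resume with s processors if the provider has them,
     *  otherwise ask the JEM Agent to take a migration decision. */
    static String resourceAction(int severity, int freeProcessors) {
        if (severity <= freeProcessors)
            return "PRAGMA_RESUME:" + severity; // resume with s OpenMP threads
        return "PRAGMA_RESCHEDULE";             // not enough local processors
    }

    /** LoadImbalance: change the scheduling strategy of the suspended job,
     *  e.g. from static to dynamic scheduling. */
    static String schedule(double sevLoad, double threshold, String current) {
        return (sevLoad > threshold && current.equals("static")) ? "dynamic" : current;
    }

    public static void main(String[] args) {
        // First measurement point of the LU Factorization run: sev_resr = 6
        System.out.println(resourceAction(6, 8));      // resume with 6 threads
        // Second point: sev_load = 8.72 under static scheduling
        System.out.println(schedule(8.72, 1.0, "static"));
        System.out.println(resourceAction(16, 8));     // provider too small
    }
}
```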

Fig. 7 Algorithm for addTuningAction of InadequateResourcesTuningSpecification [figure omitted]

Fig. 8 Algorithm for addTuningAction of LoadImbalanceTuningSpecification [figure omitted]

6.2.3 Job Migration

The major decision to be taken at the time of migration of a job is where the job is to be migrated. Three agents take part in this process: (i) the GSA, which is a higher-level agent in the hierarchical agent framework [11]; (ii) the RPAs, which are deployed on each resource provider [27]; and (iii) the JEM Agent, which carries the ResourceProviderList for a job (see Section 2). The GSA monitors all the Grid resources located at a particular geographical site, while the RPAs maintain information and establish SLAs with the client. The ResourceProviderLists carried by the JEM Agents are sorted in ascending order of resource utilization cost [29]. During the execution of a job, if the TA indicates that migration is required, the ResourceProviderList is sent to the GSA. The GSA consults the RPAs to find the resource availability and current load of each resource provider (we use the Ganglia information provider [16] to assist the RPA). It sends back a RevisedResourceProviderList, which contains only those resource providers who can currently fulfill the requirements of the job in order to achieve the guaranteed performance. Because the Grid is a dynamic environment, this step helps to take into account any change in the resource pool. The JEM Agent selects a resource provider from the list and sends a modified SLA (modified because part of the job has already been executed) to the corresponding remote RPA. If the RPA accepts the SLA, the job is migrated to that resource provider along with the JEM Agent; otherwise, the next resource provider from the list is contacted.
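The two-step selection just described (GSA filtering, then SLA offers in cost order) can be sketched as follows. The Provider record, field names and server names are illustrative assumptions; only the list structure and the selection protocol come from the paper.

```java
// Sketch of resource-provider selection during migration: the GSA prunes the
// cost-sorted ResourceProviderList, then the JEM Agent offers the modified
// SLA to the remaining providers in ascending cost order.
import java.util.List;
import java.util.function.Predicate;

public class MigrationSketch {
    record Provider(String name, double cost, int freeProcessors) {}

    /** GSA step: keep only providers that can currently meet the job's
     *  requirement; the list stays sorted by ascending utilization cost. */
    static List<Provider> revise(List<Provider> sortedByCost, int neededProcs) {
        return sortedByCost.stream()
                .filter(p -> p.freeProcessors() >= neededProcs)
                .toList();
    }

    /** JEM Agent step: offer the modified SLA in cost order; the first RPA
     *  that accepts receives the job (null if nobody accepts). */
    static String selectTarget(List<Provider> revised, Predicate<Provider> acceptsSla) {
        for (Provider p : revised)
            if (acceptsSla.test(p)) return p.name();
        return null;
    }

    public static void main(String[] args) {
        List<Provider> list = List.of(
                new Provider("server1", 1.0, 1),
                new Provider("server2", 2.5, 2),
                new Provider("server3", 4.0, 16));
        // Job now needs 4 processors: only server3 survives the GSA's revision.
        System.out.println(selectTarget(revise(list, 4), p -> true));
    }
}
```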

7 Implementation of the Adaptive Execution Scheme

The entire multi-agent framework [27] has been implemented using Java RMI technology. Qualitative requirements of a multi-agent system, such as transparency, interoperability with other systems and portability, encourage the development of PRAGMA using technologies based on Java RMI. Being an object oriented language, Java uses method invocation as its major means of communication. A major advantage of Java is that it incorporates communication mechanisms inside the language environment, whereas other languages (e.g., Fortran or C++) require external mechanisms (libraries) such as message passing. Thus, with high-speed network support (e.g., a Java high performance sockets library), a high performance Java RMI implementation is possible. Moreover, with Java RMI, object parameters are passed by object serialization, and it is known that serialization performance dominates the overall performance of Java RMI.

A part of the hierarchical analysis agent framework has been implemented using the Jade framework [1]. Jade is a Java Agent Development Environment, built with Java RMI registry facilities. It provides standard agent technologies. The agent platform can be distributed over several hosts, each of which executes one Java Virtual Machine. Jade follows the FIPA standard; thus its agent communication platform is FIPA-compliant. It uses the Agent Communication Language (ACL) for efficient communication in a distributed environment. The Jade framework supports agent mobility, and agents can execute independently and in parallel on different network hosts. Jade uses one thread per agent. Agent tasks or agent interactions are implemented through logical execution threads. These threads, or behaviours of the agents, can be initialized, suspended and spawned at any given time. In Jade, multiple agents can interact with each other using their own containers [19]. Containers are the actual environments for each agent. Typically, multiple agents can be active at the same time on different nodes with various containers, but there is only one central agent, and it needs to start first. In the current implementation, the NA is initiated first as the central agent, which coordinates with the other agents. In a Grid environment, it is important that performance analysis be done at run-time and tuning action be taken at run-time without incurring much overhead. Thus, the active Jade agents on a Grid resource cooperate and interact with each other in order to detect performance problems in real time.

Migration within the system using mobile agents has been implemented using Jini technology [22, 23]. Byassee, in his article "Unleash mobile agents using Jini" [6], discusses in detail how mobile agents can be implemented on a Jini platform. A similar approach has been adopted in the implementation of the mobile part of this system. In general, Jini offers a wide range of supporting services


for communications based on Java RMI technology. Jini provides the facility of adding new agents to the system at run-time without hampering the existing parts of the system; this increases the functionality and scalability of the system. Agents can interact with each other within the Jini network through the request/response framework or through a protocol such as XML or KQML. The primary advantage of Java RMI, and therefore of Jini, is interoperability between heterogeneous platforms. The distributed objects can invoke each other's methods even if the underlying platforms, such as processors, operating systems and programming languages, are heterogeneous. RMI assures interoperability by serializing and deserializing argument objects. Serialization is the packing of argument objects into a network message; in this step, platform dependent (in-memory) representations of the objects are converted into platform independent (canonical) representations. Deserialization is the reverse action: argument objects are unpacked from a network message and converted from canonical representations back into in-memory representations. The use of canonical representations allows the sender and receiver to communicate with each other even if they use different memory layouts.

Class diagrams for the implementation of the agent framework for performance analysis and adaptive execution are presented in Figs. 9 and 10. Figure 9 presents the class diagram for the interactions of each agent with two base classes, PerformancePropertySpecification and PerformancePropertyTuningSpecification. In addition, Fig. 10 shows the association of these two base classes with other subclasses.

Fig. 9 Class diagram for JEM Agent, Node Agent, Tuning Agent and GridSite Agent [figure omitted]
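The serialize/deserialize round trip described above can be demonstrated with the standard Java object streams that RMI uses under the hood. The Job class here is an illustrative stand-in, not a class from PRAGMA.

```java
// Round-trip example of the serialization/deserialization step that Java RMI
// performs on argument objects. The Job class is illustrative only.
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class SerializationDemo {
    static class Job implements Serializable {
        private static final long serialVersionUID = 1L;
        String id; int processors;
        Job(String id, int processors) { this.id = id; this.processors = processors; }
    }

    /** Pack an argument object into a canonical, platform-independent form. */
    static byte[] serialize(Job j) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bos)) { out.writeObject(j); }
        return bos.toByteArray();
    }

    /** Unpack the canonical form back into an in-memory representation. */
    static Job deserialize(byte[] bytes) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return (Job) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Job restored = deserialize(serialize(new Job("lu-4000", 8)));
        System.out.println(restored.id + " " + restored.processors); // lu-4000 8
    }
}
```

Because the byte stream is canonical, the receiving JVM may live on a machine with a completely different memory layout, which is exactly what makes the JEM Agent's moves between heterogeneous providers possible.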

Fig. 10 Class diagram for performance property specification and performance property tuning specification [figure omitted]

7.1 Experimental Setup

The above adaptive execution scheme has been tested on a local Grid test bed set up with heterogeneous nodes running Linux. The computational nodes of the test bed include Intel Core2 Duo PCs and Intel Pentium-4 PCs, an HP NetServer LH 6000 with two processors, an IBM Power4 pSeries 690 (P-690) Regatta server with 16 processors, and an HP ProLiant ML570 G4 with four Intel Core2 Duo processors (HpServer). The nodes communicate through a fast local area network. The Grid test bed is built with Globus Toolkit 4.0 (GT4.0) [18]. The Ganglia information provider [16] is installed on each node of the test bed. GT4.0 comes with the Monitoring and Discovery Service (MDS4), which is able to use Ganglia and Hawkeye as external information providers. With the help of Globus MDS4 and the Ganglia information provider, information about resources and their quality is collected by PRAGMA. In the current implementation of the multi-agent system, only the information providers of the Globus middleware are used; integration of the framework with other components of GT4.0 (such as GRAM) has not yet been implemented.

For experimentation purposes, SciMark 2.0 benchmark codes [32] for parallel C applications and JGF (Java Grande Forum) benchmark codes [21] for parallel Java applications have been used. Jobs in parallel Java are implemented using Java OpenMP, i.e. JOMP [4]. Applications in C OpenMP and Java OpenMP are executed to exhibit the interoperability of the mobile agent framework within the MAS. Four different programs are considered: LU matrix factorization, Matrix Multiplication, Gaussian Elimination and Sparse Matrix Multiplication. LU matrix factorization and Matrix Multiplication are used to demonstrate adaptive execution by local tuning, while adaptive execution by rescheduling is demonstrated using Gaussian Elimination and Sparse Matrix Multiplication. Experiments are primarily carried out on our local Grid test bed.

7.2 Experimental Results

The next two sub-sections present the results of the experiments carried out to demonstrate the efficiency of adaptive execution by local tuning techniques and by rescheduling.

7.2.1 Adaptive Execution by Local Tuning

We have experimented with periodic (Section 6.2.1) performance analysis and application of tuning techniques for both the InadequateResources and LoadImbalance performance properties according to a given priority. As shown in the algorithms in Section 6.2.1, the NA first checks the InadequateResources property, followed by the LoadImbalance property, periodically. In the experiments shown here, each job starts executing with a single thread (on a single processor) and static scheduling. At each measurement point, whenever a performance problem is detected, a tuning action is taken; tuning actions are decided on the basis of the discussion and algorithms presented in Section 6.2.2. Different metrics related to the execution performance of the jobs and the corresponding tuning actions at these measurement points are shown in Table 1 (for the LU factorization job) and in Table 2 (for the matrix multiplication job). The expected completion time for each job is referred to as Tect and is specified by the client at the time of job submission. After each measurement point, the remaining expected completion time Trem_ect is computed as follows:

    Trem_ect = Tect − Tfact    (9)

The actual completion time for the fraction f of a job executed so far is denoted by Tfact. The projected execution time Tpact at each measurement point is computed using the following equation:

    Tpact = (ti / fi) × (100 − Σ_{k=1}^{i} fk) + Σ_{k=1}^{i} tk    (10)

where ti is the time required to execute the fi portion of the job between the last two measurement points, and the fractions fk are expressed as percentages of the whole job. Tpact is computed to check whether there is a chance to meet the performance guarantee. Severity values corresponding to the two properties at each measurement point are also shown in the tables: sev_resr denotes the severity of the InadequateResources property and sev_load that of the LoadImbalance property.

Table 1 Result of the experiments with LU Factorization job

(a) LU Factorization, datasize 4000, Tect = 100 sec (1 min 40 sec):

| Resource, execution setup | f (%) | Iterations | Tfact (sec) | Tpact (sec) | Trem_ect (sec) | Severity | Tuning action |
|---|---|---|---|---|---|---|---|
| HpServer, 1 processor, static scheduling | 5 | 0-199 | 23.71 | 474.12 | 76.29 | sev_resr = 6 | processors increased to 6 |
| HpServer, 6 processors, static scheduling | 10 | 200-599 | 19.10 | 205.18 | 57.19 | sev_resr = 3, sev_load = 8.72 | processors increased to 8, scheduling changed to dynamic |
| HpServer, 8 processors, dynamic scheduling | 20 | 600-1399 | 15.21 | 107.43 | 41.99 | sev_resr = 1, sev_load = 0.26 | not needed |
| do | 40 | 1400-2999 | 12.62 | 78.53 | 29.36 | sev_resr = 1, sev_load = 0.30 | not needed |
| do | rem | 3000-3999 | 0.70 | 71.34 | 28.66 | N.A. | N.A. |

Total time: 71.34 sec, i.e. 1 min 11 sec.

(b) LU Factorization, datasize 8000, Tect = 860 sec (14 min 20 sec):

| Resource, execution setup | f (%) | Iterations | Tfact (sec) | Tpact (sec) | Trem_ect (sec) | Severity | Tuning action |
|---|---|---|---|---|---|---|---|
| HpServer, 1 processor, static scheduling | 5 | 0-399 | 195.54 | 3910.84 | 664.46 | sev_resr = 6 | processors increased to 6 |
| HpServer, 6 processors, static scheduling | 10 | 400-1199 | 161.80 | 1732.60 | 502.66 | sev_resr = 3, sev_load = 13.95 | processors increased to 8, scheduling changed to dynamic |
| HpServer, 8 processors, dynamic scheduling | 20 | 1200-2799 | 122.86 | 879.47 | 379.81 | sev_resr = 1, sev_load = 0.1 | not needed |
| do | 40 | 2800-5999 | 89.75 | 626.04 | 290.06 | sev_resr = 1, sev_load = 0.18 | not needed |
| do | rem | 6000-7999 | 5.23 | 575.17 | 284.83 | N.A. | N.A. |

Total time: 575.17 sec, i.e. 9 min 35 sec.

The threshold value for both properties has been taken as 1. The rationale for choosing this value for the InadequateResources property has been explained in Section 5.1. For the LoadImbalance property, however, this value indicates almost perfect parallelism, which is not the usual case in real-life programs. Nevertheless, the results reported in this paper have been obtained using only a few processors (e.g. 2 or 4). Thus, for square loops (e.g. Matrix Multiplication) sev_load is generally less than one, whereas sev_load for triangular loops (e.g. LU Factorization) is high, so a threshold value of 1 has been considered here. In real applications, when a large number of processors is used, a threshold value of 1 may not be appropriate; the threshold must be decided on the basis of the nature of the application, the loop shape, historical data and the execution platform. On the basis of the severity values, tuning actions are taken, which are also described in the tables. The total execution time required for each job is also shown, to demonstrate that

the guaranteed performance could be achieved in all these cases. Figure 11 compares the expected completion time (provided by client), projected completion time (based on measured time in the execution environment) and actual completion time (time needed to complete the job) for each portion of the job and demonstrates that using the above scheme suggested in this paper, guaranteed performance could be achieved for both LU Factorization (Fig. 11a) and Matrix multiplication (Fig. 11b).
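Equations (9) and (10) can be checked against the first two measurement points of the LU Factorization (datasize 4000) run in Table 1. The class and method names below are ours; the numbers are taken from the table.

```java
// Recomputing Tpact and Trem_ect from Eqs. (9) and (10) for the LU
// Factorization (datasize 4000) run of Table 1. Names are illustrative.
public class ProjectedTime {
    /** Eq. (10): Tpact = (ti/fi) * (100 - sum fk) + sum tk, f in percent. */
    static double tpact(double[] f, double[] t, int i) {
        double sumF = 0, sumT = 0;
        for (int k = 0; k <= i; k++) { sumF += f[k]; sumT += t[k]; }
        return t[i] / f[i] * (100 - sumF) + sumT;
    }

    public static void main(String[] args) {
        double[] f = {5, 10};        // fractions executed so far (percent)
        double[] t = {23.71, 19.10}; // measured times for those fractions (sec)
        System.out.printf("%.2f%n", tpact(f, t, 0)); // ~474.2 (Table 1: 474.12)
        System.out.printf("%.2f%n", tpact(f, t, 1)); // 205.16 (Table 1: 205.18)
        // Eq. (9) after the second point, Tect = 100 sec:
        System.out.printf("%.2f%n", 100 - (23.71 + 19.10)); // 57.19, as in Table 1
    }
}
```

The recomputed values match the table entries to within rounding, which confirms the reading of the reconstructed Eq. (10).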

Table 2 Result of the experiment for Matrix Multiplication job

(a) Matrix Multiplication, datasize 2000, Tect = 70 sec (1 min 10 sec):

| Resource, execution setup | f (%) | Iterations | Tfact (sec) | Tpact (sec) | Trem_ect (sec) | Severity | Tuning action |
|---|---|---|---|---|---|---|---|
| HpServer, 1 processor, static scheduling | 5 | 0-99 | 12.31 | 246.22 | 57.69 | sev_resr = 4 | processors increased to 4 |
| HpServer, 4 processors, static scheduling | 10 | 100-299 | 6.19 | 71.12 | 51.50 | sev_resr = 1, sev_load = 0.76 | not needed |
| do | 20 | 300-699 | 11.30 | 66.54 | 40.20 | sev_resr = 1, sev_load = 0.84 | not needed |
| do | 40 | 700-1499 | 22.72 | 66.72 | 17.48 | sev_resr = 1, sev_load = 0.17 | not needed |
| do | rem | 1500-1999 | 14.18 | 66.71 | 3.30 | N.A. | N.A. |

Total time: 66.71 sec, i.e. 1 min 7 sec.

(b) Matrix Multiplication, datasize 4000, Tect = 700 sec (11 min 40 sec):

| Resource, execution setup | f (%) | Iterations | Tfact (sec) | Tpact (sec) | Trem_ect (sec) | Severity | Tuning action |
|---|---|---|---|---|---|---|---|
| HpServer, 1 processor, static scheduling | 5 | 0-199 | 173.09 | 3461.72 | 526.91 | sev_resr = 6 | processors increased to 6 |
| HpServer, 6 processors, static scheduling | 10 | 200-599 | 60.71 | 749.87 | 466.20 | sev_resr = 1, sev_load = 0.25 | not needed |
| do | 20 | 600-1399 | 95.52 | 639.77 | 370.68 | sev_resr = 1, sev_load = 0.21 | not needed |
| do | 40 | 1400-2999 | 208.60 | 668.29 | 162.08 | sev_resr = 1, sev_load = 0.19 | not needed |
| do | rem | 3000-3999 | 129.60 | 667.52 | 32.48 | N.A. | N.A. |

Total time: 667.52 sec, i.e. 11 min 7 sec.

Discussion

The above results demonstrate that in all four cases the expected completion times were achieved with some tuning of the execution environment. In every case, either the required number of resources was available, or the performance guarantee could be satisfied using the maximum number of available resources (as for LU factorization with datasize 8000). However, this may not be the case for many jobs running on some resource providers. If the required number of resources is not available, the job will be migrated as mentioned earlier in Section 6. If migration is not possible (a higher capacity resource provider may not be available), or if even after migration the performance guarantee could not be met, a 'PerformanceDetails' report (an analysis report) is generated after completion of the job and is sent to the GA with all details related to the execution of the job. In the above four cases, the expected completion times were obtained using the execution history of the job. Ongoing research focuses on job modeling techniques, which can be used to estimate the expected completion time using detailed information (static information as well as execution history) about a specific job.

Fig. 11 Comparison of performances of (a) LU Factorization and (b) Matrix Multiplication jobs [bar charts of Tect, Tpact and Tfact (sec) against data size; figure omitted]

7.2.2 Adaptive Execution by Rescheduling

A job is rescheduled to a different resource provider for the reasons stated earlier in Section 6. The experiments described in this section demonstrate the performance improvement of jobs after rescheduling. A parallel C code for Gauss Elimination and a parallel Java code for Sparse Matrix Multiplication have been used for this purpose. In each case, two scenarios have been compared: (1) in the first scenario, a job completes its execution on the resource provider to which it was initially allocated; (2) in the second scenario, a job starts its execution on a resource provider and, when a performance problem is indicated by the NA, it is checkpointed and migrated to a new resource provider. When a job is migrated to a different resource provider, additional resources (e.g. processors, if available) are allocated to it as per the specifications in the modified SLA.

Fig. 12 Gauss Elimination—comparison of performances in Scenario1(a) and Scenario2(a) [figure omitted]

Fig. 13 Sparse matrix multiplication—comparison of performances in Scenario1(b) and Scenario2(b) [figure omitted]

Figures 12 and 13 compare the total time taken to execute the two codes in the above two scenarios. Execution times are measured for different data sizes. In the case of Gauss Elimination (Fig. 12), the job is initially scheduled to a Pentium-4 PC (server1) and completes its execution on it with a single processor (scenario1(a)). In the second scenario (scenario2(a)), the job is initially scheduled to server1 (starting execution with one processor) and, after completing a part (in this case half) of its computation, it is rescheduled to the HP NetServer LH 6000 (server2) and executes the remaining part of the job on server2 using two processors. In the case of Sparse Matrix Multiplication (Fig. 13), scenario1(b) demonstrates the execution of the job entirely on server2 with two processors, and scenario2(b) demonstrates the results of migrating the job from server2 (where 36.5% of its computation is done using two processors) to the P-690 Regatta server (server3) (where the remaining part is completed using four processors).

In all the experiments of Scenario2, overheads are involved because of migration of jobs from one server to another. These overheads are due to the additional time required for discovering the services, rescheduling the jobs and establishing SLAs on a second resource provider. Figures 14 and 15 depict the times for executing

the jobs on the two different servers and the associated overheads.

Fig. 14 Gauss Elimination—results of rescheduling from Server1 to Server2 in case of Scenario2(a) [figure omitted]

Fig. 15 Sparse matrix multiplication—results of rescheduling from Server2 to Server3 in case of Scenario2(b) [figure omitted]

Discussion

The above results clearly indicate that migration is a good decision when the associated overhead is insignificant compared to the benefit achieved by migration. This set of experiments demonstrates the cases in which an adaptive execution environment can be offered. In both experiments, the decision to reschedule proved to be beneficial. For obvious reasons, when the job migrates to Server3 and executes with additional processors, performance is always improved. Rescheduling overhead is involved in all these experiments, but with respect to the performance improvement of the affected job it is negligible.

8 Related Work

Adaptive execution has received a lot of attention from Grid researchers. The GrADS project [2, 3] presents a program execution framework which supports adaptive reallocation if performance degrades because of changes in the availability of Grid resources. It also implements rescheduling methods on the basis of a resource selection model and monitors performance contracts using a simple stop/migrate/restart strategy. The performance contracts specify the expected performance of modules as a function of the available resources in the Grid. In [8], a job-monitoring technique and an analyzer tool for automatic generation of the required job behaviour description are proposed. This work gathers resource access information and produces a job behaviour description after analysis; the job behaviour description is then used to choose the scheduling algorithms for jobs in the Grid. A low-cost rescheduling policy is proposed in [30]. With this policy, after the initial mapping, some jobs are selectively rescheduled on the basis of their run-time performance analysis. It considers rescheduling at a few, carefully selected points during execution; the change in the resource pool is not considered in this work. In [37], a HEFT-based adaptive rescheduling algorithm is presented with collaboration between a workflow planner and an executor. In this approach, the executor notifies the planner about any run-time event, through which the planner gets information about resource unavailability or the discovery of new resources. This research adapts to resource pool changes to achieve better performance and reschedules the remaining jobs if necessary. A workflow management system defines, manages and executes workflows on computing resources. Characterization and classification of various approaches for building and executing workflows on Grids are discussed in [36], which also presents a survey of existing workflow management systems for the Grid. A dynamic approach to the performance instrumentation, monitoring and analysis of Grid workflows is discussed in [34]. The ICENI project puts emphasis on the development of a component-based and platform-independent framework for generality and simplicity of use in a Grid environment [15]. ICENI tools provide optimized deployment of Grid application components via performance-guided implementation selection. A component-based application model is used in ICENI, where domain-specific knowledge is encapsulated within software components. ICENI also supports dynamic extension through its ability to instantiate and connect new components to an existing deployed application. Multiple steering and visualization components can also be configured to provide collaborative interaction sessions between trusted members. Minimum execution time and minimum execution cost are two user-requirement constraints implemented within ICENI. In [25], service-based, end-to-end workflow processing in a Grid environment with heterogeneous resources is presented.
The PerCo performance control framework migrates distributed scientific coupled models in response to changes in an execution environment [19]. This framework uses a feedback control mechanism and redeploys components based on regression analysis. Regression analysis is used with a performance repository to predict the execution time of the components on other machines under different loadings. However, this framework primarily works with component-based applications, and a redeployment decision is made for each component on the basis of the predicted gain for the entire application. OpenMP-based adaptive parallelism is presented in [31]; in this system, task-parallel OpenMP applications can execute on a network of workstations (NOW) with different nodes. The problem of resource selection and adaptation in Grid environments is discussed in [35], where adaptation of resources is done on the basis of the degree of parallelism. In this work, the adaptation coordinator periodically collects performance statistics, computes the weighted average efficiency and compares it with a certain threshold to add or remove processors from the cluster. The system, however, does not support run-time job migration.

The research works described above mainly concentrate on run-time job migration. In contrast, the work presented in this paper supports adaptive execution either by application of local tuning techniques or through run-time job migration, depending upon the specific situation. Our research focuses on an integrated multi-agent-based framework, in which agents (including mobile agents) are deployed to coordinate the adaptive execution of multiple concurrent jobs; this is also a novelty of our system. In order to detect performance problems, we have used the notion of performance properties and their severities. Moreover, the algorithms and strategies presented in this work not only aim at maintaining the performance guarantee, but also attempt to keep the resource utilization cost as low as possible. Thus, any over-provisioning is controlled by iteratively measuring the severity of the performance properties and taking appropriate action immediately (sometimes even decreasing the allocated resources).

9 Conclusion

In this paper, we have presented an adaptive execution scheme for a Grid environment based on an integrated agent framework. The scheme supports performance analysis of components of applications executing concurrently on different resource providers and dynamically improves their execution performance. Unlike other research works, our scheme is based on the notion of performance properties, and supports both local tuning and migration of jobs depending on resource availability and the current resource usage scenario. The algorithms used within the framework, and the interactions and exchange of information among the agents for collecting data, analyzing performance and improving it through the application of various actions, are explained in this paper. Various experiments have been carried out which demonstrate the functioning of the algorithms and the effectiveness of the framework. It has also been observed that the agent control overheads are negligible even when multiple jobs are submitted concurrently to the same resource provider.

The work presented in this paper has certain limitations: only batches of independent jobs without any communication requirements are considered, only two performance properties are handled, and the tuning actions implemented in the current version of the tool PRAGMA are simple. The work needs to be extended further to include more complicated situations and to exploit a sophisticated knowledge base for appropriate tuning of the jobs. Currently, a portal is being developed for using the tool. Components of the tool should also be integrated before making it available for use by other researchers.

Acknowledgements

This research work has been supported by the project entitled "Developing Multi-Agent System for Performance Based Resource Brokering and Management in Computational Grid Environment" funded by the Department of Science and Technology, Government of India under the SERC scheme.


References

1. Bellifemine, F., Poggi, A., Rimassa, G.: Developing multi agent systems with a FIPA-compliant agent framework. Softw. Pract. Exp. 31, 103–128 (2000)
2. Berman, F., et al.: New Grid scheduling and rescheduling methods in the GrADS project. Int. J. Parallel Program. 33(2), 209–229 (2005)
3. Berman, F., et al.: Toward a framework for preparing and executing adaptive Grid programs. In: Proceedings of the 16th International Parallel and Distributed Processing Symposium, p. 322 (2002)
4. Bull, J.M., Kambites, M.E.: JOMP—an OpenMP-like interface for Java. In: Proceedings of the ACM 2000 Java Grande Conference, pp. 44–53 (2000)
5. Bull, J.M., Ford, R.W., Dickinson, A.: A feedback based load balance algorithm for physics routines in NWP. In: Hoffman, G.-R., Kreitz, N. (eds.) Proceedings of the 7th Workshop on the Use of Parallel Processors in Meteorology, pp. 239–249. World Scientific, Hackensack, NJ (1996)
6. Byassee, J.: Unleash mobile agents using Jini: leverage Jini to mitigate the complexity of mobile agent applications. http://www.javaworld.com/javaworld/jw-062002/jw-0628-jini.html (2002)
7. Castro, A.: Designing a multi-agent system for monitoring and operations recovery for an airline operations control centre. Master's thesis, University of Porto, Faculty of Engineering. http://antonio.jm.castro.googlepages.com/MasterThesis_AntonioCastro_Final_Rev.pdf (2007)
8. Csaba, L., Lorincz, T., Kozsik, U.A., Horvath, Z.: A method for job scheduling in Grid based on job execution status. Multiagent Grid Syst. 1(3), 197–208 (2005)
9. Czajkowski, K., Foster, I., Kesselman, C.: Resource and service management, Chapter 18. In: Foster, I., Kesselman, C. (eds.) The Grid 2: Blueprint for a New Computing Infrastructure, 2nd edn. Kaufmann, San Francisco, CA (2003)
10. De Sarkar, A., Ghosh, D., Mukhopadhyay, R., Mukherjee, N.: Implementation of a Grid performance analysis and tuning framework using JADE technology. In: The 2008 International Conference on Grid Computing and Applications (GCA'08), Las Vegas, USA, 14–17 July (2008)
11. De Sarkar, A., Kundu, S., Mukherjee, N.: A hierarchical agent framework for tuning application performance in Grid environment. In: Proceedings of the 2nd IEEE Asia-Pacific Service Computing Conference (IEEE APSCC 2007), pp. 296–303. Tsukuba, Japan, 11–14 December (2007)
12. Fahringer, T., Gerndt, M., Riley, G., Traff, J.: Knowledge specification for automatic performance analysis. APART Technical Report, Workpackage 2: Identification and Formalization of Knowledge, Technical Report FZJ-ZAM-IB-9918, D-52425 Julich (1999)
13. Fahringer, T., Gerndt, M., Riley, G., Traff, J.: Formalizing OpenMP performance properties with ASL. In: Proceedings of the Third International Symposium on High Performance Computing (ISHPC), pp. 428–439. 16–18 October (2000)
14. Fürlinger, K., Gerndt, M.: Finding inefficiencies in OpenMP applications automatically with Periscope. In: Proceedings of the 2006 International Conference on Computational Science (ICCS 2006), vol. 2, pp. 494–501. Reading, UK (2006)
15. Furmento, N., Mayer, A., McGough, S., Newhouse, S., Field, T., Darlington, J.: ICENI: optimisation of component applications within a Grid environment. Parallel Comput. 28(12), 1753–1772 (2002)
16. Ganglia Information Provider. http://ganglia.sourceforge.net (2007)
17. Ghosh, D., Mukhopadhyay, R., De Sarkar, A., Mukherjee, N.: Study of the execution performance of parallel loops in OpenMP programs using different scheduling strategies. In: National Conference on Trends in Computing Technologies, Chennai, pp. 75–85 (2009)
18. Globus Toolkit 4.0. www.globus.org/toolkit
19. Hussein, M., Mayes, K., Luján, M., Gurd, J.: Adaptive performance control for distributed scientific coupled models. In: Proceedings of the 21st Annual International Conference on Supercomputing, Seattle, Washington, 17–21 June (2007)
20. JADE Administrator's Guide. http://jade.tilab.com/doc/administratorsguide.pdf (2008)
21. Java Grande Forum. http://www.epcc.ed.ac.uk/research/activities/java-grande/ (2009)
22. Jini. http://www.jini.org (2007)
23. Jini Tutorial. http://pandonia.canberra.edu.au/java/jini/tutorial/Jini.xml
24. Kesler, J.C.: Overview of Grid Computing. MCNC, Research Triangle Park, NC (2003)
25. McGough, S., Cohen, J., Darlington, J., Katsiri, E., Lee, W., Panagiotidi, S., Patel, Y.: An end-to-end workflow pipeline for large-scale Grid computing. J. Grid Computing 3(3–4), 259–281 (2005)
26. Performance property specification in Julich Supercomputing Centre (JSC). http://www.fz-juelich.de/jsc/kojak/performance_props/ (2008)
27. Roy, S., Mukherjee, N.: Utilizing Jini features to implement a multiagent framework for performance-based resource allocation in Grid environment. In: Proceedings of the International Conference on Grid Computing and Applications (GCA'06), The 2006 World Congress in Computer Science, Computer Engineering, and Applied Computing, pp. 52–60. Las Vegas, 26–29 June (2006)
28. Roy, S.: Performance-based resource management in computational Grid environment. Ph.D. thesis, Jadavpur University, Faculty of Engineering (2007)
29. Roy, S., Sarkar, M., Mukherjee, N.: Optimizing resource allocation for multiple concurrent jobs in Grid environment. In: Proceedings of the Third International Workshop on Scheduling and Resource Management for Parallel and Distributed Systems (SRMPDS'07), Hsinchu, Taiwan, 5–7 December (2007)
30. Sakellariou, R., Zhao, H.: A low-cost rescheduling policy for efficient mapping of workflows on Grid systems. Sci. Program. 12(4), 253–262 (2004)
31. Scherer, A., Gross, T., Zwaenepoel, W.: An evaluation of adaptive execution of OpenMP task parallel programs. In: Proceedings of Languages, Compilers, and Runtimes for Scalable Computing (2000)
32. SciMark benchmark. http://math.nist.gov/scimark2/index.html (2004)
33. Sun's Jini. http://www.sun.com/jini/ (2007)
34. Truong, H.-L., Fahringer, T., Dustdar, S.: Dynamic instrumentation, performance monitoring and analysis of Grid scientific workflows. J. Grid Computing 3(1–2), 1–18 (2005)
35. Wrzesinska, G., Maassen, J., Bal, H.E.: Self-adaptive applications on the Grid. In: ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP'07), pp. 121–129. San Jose, CA, USA, 14–17 (2007)
36. Yu, J., Buyya, R.: A taxonomy of workflow management systems for Grid computing. J. Grid Computing 3(3–4), 171–200 (2005)
37. Yu, Z., Shi, W.: An adaptive rescheduling strategy for Grid workflow applications. In: Proceedings of IPDPS, pp. 1–8 (2007)
