
Market-Oriented Multiple Resource Scheduling in Grid Computing Environments

Chia-Hung Chien, Paul Hsueh-Min Chang, Von-Wun Soo
National Tsing Hua University, Taiwan
{gump, pchang, soo}

Abstract

In a Grid computing environment, each client has its own job represented as a workflow composed of tasks that require multiple types of computational resources to complete. Developing a mechanism that schedules these workflows to utilize limited amounts of resources in the Grid is a challenging problem. This paper takes a market-oriented approach that allows the job scheduling task to be distributed among clients. In this approach, Workflow Agents plan feasible schedules for their jobs and compete in the resource market. A Market Broker Agent is implemented to resolve the conflicts that arise from simultaneous access to the same resource. Experiment results show that the proposed approach outperforms first-come-first-serve and a variant of the shortest-job-first method in terms of the ratio of jobs completed before their deadlines.

Proceedings of the 19th International Conference on Advanced Information Networking and Applications (AINA’05) 1550-445X/05 $20.00 © 2005 IEEE

1. Introduction

Drawing much recent attention, Grid computing is an emerging concept toward a computational infrastructure for resource sharing on the Internet. The main proponents [3, 4] define the idea of Grid computing as the coordination of multi-site computation resources in a Virtual Organization. A Virtual Organization is formed temporarily for cooperation and resource sharing among several physical organizations to accomplish computational tasks. After the tasks are completed, the Virtual Organization can be dismissed.

Although computers nowadays are becoming more and more powerful, particular computational resources are still limited in some special applications, such as mass spectrometric analysis and 3D rendering computations. Those limited resources are usually quite expensive but also indispensable and irreplaceable for many applications, and therefore sharing those resources is essential. On the other hand, some jobs, such as a data mining job to analyze a large data set, can be very computation-intensive, and thus need considerable computational resources to reduce execution time. Intuitively, the job can be sped up if it is divided into several smaller tasks that are distributed to multiple online computers to process. The total execution time can then be reduced by making use of idle resources and executing the sub-tasks in parallel.

Since the requirements of both scenarios above involve the sharing of computational resources, a Grid computing environment can facilitate both types of jobs. With the development of standardized protocols, “the Grid” is forming as a global computational infrastructure utilizing a large number of computers. Nevertheless, the realization of the Grid has proved challenging in terms of protocol design, access control and resource management. In particular, an important challenge of the Grid is to develop a resource scheduling mechanism that allocates proper computational resources to the jobs in an efficient way. The difficulties in the scheduling problem of Grid computing environments are:

1. Dynamic environment: Both the resource servers and the clients can dynamically decide when to join or leave the Grid environment based on their own considerations.
2. Multiple resource types and alternative instances: Resources of multiple types could be selected together to complete a given complex job. Each resource type can also have many alternative instances to choose from in the Grid environment.
3. Combinational resource requirement: Each stage in workflow execution may need multiple resource units of multiple types to complete a given task. Therefore, the challenge is not merely to allocate all needed resources, but also to make sure clients get them at the same time.
4. Complicated execution of a workflow: A workflow is divided into several sub-tasks, some of which can be concurrent. Concurrent parts are allocated different resources in order to shorten the completion time of the workflow. Therefore, the execution of a workflow is no longer linear.
5. Parallel execution: Similarly, the execution of each sub-task in a workflow could also be carried out in parallel in order to shorten the completion time.

In traditional systems, the scheduling problem is usually solved with a centralized scheduler. However, centralization is often vulnerable and risky because of the single-node failure problem: the system fails if the host of the scheduler fails. In addition, load balancing is also an issue for centralized schedulers. In contrast, this paper attempts to solve the Grid scheduling problem with a multi-agent mechanism inspired by market-oriented programming [2, 10] techniques, removing the burden of scheduling from a central node. In our multi-agent system, the Workflow Agents are responsible for planning the best possible resource allocation of a workflow schedule for the client-side users they represent, by competing and bidding in a computational resource market. The Market Broker Agent in the Grid computing environment is responsible for resolving conflicts among the resource demands of the Workflow Agents. By distributing part of the computation of the scheduling task among the clients, this approach reduces the overloading problem of a centralized scheduler.

The rest of this paper is organized as follows. Section 2 briefly surveys existing work related to the scheduling problem. Section 3 describes our models of workflows and the resource market. Section 4 introduces the multi-agent scheduling mechanism based on the models. Section 5 shows the experiment results compared to first-come-first-serve (FCFS) and a variant of the shortest-job-first method. Section 6 concludes this paper.

2. Related work

2.1. Condor

The Condor project [6], which has been under development for about fifteen years, aims to provide a high-throughput scheduler to harness idle CPU cycles. Condor uses the ClassAd mechanism, which involves clients, resource providers and the matchmaker. The matchmaker makes suitable resource allocations by matching users' requirements, posted as advertisements, against the advertisements of the resources. It notifies both user programs and resource providers for further negotiation, and the client later submits its jobs to the resource server for execution. Condor-G is a recent effort to integrate Condor with Grid environments.

Although the Condor system is successful in CPU scheduling and effectively reduces the computing time of CPU-bound applications, it cannot schedule tasks that explicitly specify the need for multiple types of resources. For example, Condor's matchmaker cannot appropriately coordinate tasks that need both CPU and disk resources at the same time. However, the environment's capability of coordinating multiple resource types is important for complex Grid applications.

2.2. Simple, combinatorial and continuous auctions

Market-oriented mechanisms, such as auctions, provide theoretical perspectives for resource scheduling. In the simple auction mechanism, every agent selects its favorite good to bid for. Buyer agents follow some bidding rules (such as those of English auctions, Dutch auctions and Vickrey auctions) to bid for their favorite goods, and an auctioneer determines which agent wins the good according to the type of the auction mechanism. Take the English auction for example: every buyer agent increases the price of its bid incrementally until no agent will bid higher or the time is up. The auctioneer then allocates the good to the bidder with the highest bid. Wellman et al. [11] have developed several applications based on auction mechanisms, but those methodologies are not directly applicable to Grid computing environments because of the need to coordinate multiple resource requirements.

Multiple resource requirements can be regarded as a problem of dealing with a bundle of goods and the correlation among goods. Recently, combinatorial auction mechanisms [6] have attracted much attention by taking the correlations among goods into consideration. Generally speaking, in a combinatorial auction every agent can bid for the bundles of goods it is interested in using an XOR-bid or other bidding methods. The auctioneer collects all bids and selects those that maximize the seller's revenue. However, combinatorial auctions suffer from high computational complexity. It has been shown that winner determination in combinatorial auctions is an NP-complete problem [6]. Therefore, applying the combinatorial auction protocol to job scheduling in large-scale Grid computing environments is not feasible because of the scalability issue caused by a potentially large number of clients. Besides, the high time complexity of combinatorial auctions also aggravates the problem of centralization. Since the Grid is a type of distributed system, we intuitively do not want a centralized server to be responsible for such heavy jobs.
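The cost of winner determination can be illustrated with a brute-force sketch. It is not taken from any of the cited systems: the function name `best_allocation` and the sample bids are hypothetical, and the exhaustive search is only meant to show why the auctioneer's work grows exponentially with the number of bids.

```python
from itertools import combinations


def best_allocation(bids):
    """Exhaustively pick the revenue-maximizing set of compatible bids.

    bids: list of (bidder, bundle, price), bundle being a frozenset of goods.
    Bids are compatible if no good is sold twice and each bidder wins at
    most one bid (the XOR-bid rule).
    """
    best_revenue, best_set = 0, ()
    # Enumerates all 2**len(bids) - 1 non-empty subsets of the bids.
    for k in range(1, len(bids) + 1):
        for subset in combinations(bids, k):
            bidders = [b for b, _, _ in subset]
            goods = [g for _, bundle, _ in subset for g in bundle]
            if len(set(bidders)) < len(bidders) or len(set(goods)) < len(goods):
                continue  # a bidder or a good appears twice
            revenue = sum(p for _, _, p in subset)
            if revenue > best_revenue:
                best_revenue, best_set = revenue, subset
    return best_revenue, best_set


# Hypothetical bids: agent A places an XOR-bid on two alternative bundles.
example_bids = [
    ("A", frozenset({"cpu", "disk"}), 10),
    ("A", frozenset({"cpu"}), 6),
    ("B", frozenset({"disk"}), 5),
    ("C", frozenset({"cpu"}), 4),
]
revenue, chosen = best_allocation(example_bids)
```

With only four bids the loop already inspects fifteen subsets; each additional bid doubles the search space, which is the scalability concern raised above.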


The scheduling mechanism can be modeled as a continuous auction to avoid the problems above. In a continuous auction, clients bid for some resources without the need to specify their correlation. After a fixed amount of time, the market clears, and the clients are ready for their next bid. The cycle iterates until each client is allocated all the resources it needs. Although a continuous double auction cannot guarantee an optimal solution, its auctioneer does not incur the high time complexity of combinatorial auctions, since the clients maintain the correlation of their goods through their bidding sequence. The agent scheduling mechanism of this paper is inspired by the protocol of the continuous double auction.
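The round-based cycle described here can be sketched as follows. This is a deliberately simplified, one-sided illustration with fixed bid prices and a fresh supply in every round; the function name `run_continuous_auction` and all parameters are assumptions of the sketch, not part of the protocol the paper builds on.

```python
def run_continuous_auction(needs, prices, supply_per_round, max_rounds=100):
    """Clear the market in rounds until every client has all its resources.

    needs: client -> list of resource types, in the order the client bids.
    prices: client -> bid price (kept fixed here for simplicity).
    supply_per_round: resource type -> units offered in each round.
    Returns client -> list of the rounds in which each need was won.
    """
    remaining = {c: list(n) for c, n in needs.items()}
    won = {c: [] for c in needs}
    for round_no in range(1, max_rounds + 1):
        if not any(remaining.values()):
            break  # every client holds everything it asked for
        supply = dict(supply_per_round)
        # Each unfinished client bids only for its next needed resource;
        # the correlation between its goods lives in its bidding sequence.
        bidders = sorted(
            (c for c in remaining if remaining[c]),
            key=lambda c: prices[c],
            reverse=True,
        )
        for client in bidders:
            resource = remaining[client][0]
            if supply.get(resource, 0) > 0:
                supply[resource] -= 1
                remaining[client].pop(0)
                won[client].append(round_no)
    return won


# Hypothetical clients: w1 outbids w2 for the single CPU in round 1.
result = run_continuous_auction(
    needs={"w1": ["cpu", "disk"], "w2": ["cpu"]},
    prices={"w1": 5, "w2": 3},
    supply_per_round={"cpu": 1, "disk": 1},
)
```

In each round the auctioneer only sorts the bids and hands out units, which is why its cost stays low compared with a combinatorial auction.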

3. The computational resource market model

3.1. Definition of workflows and resource commodities

A workflow is composed of a set of tasks $\{t_1, t_2, \ldots, t_n\}$, and a task is represented as $(d, \{(r, a)\})$, where $d$ is the duration of the task, $r$ is a resource type needed by this task, and $a$ is the required amount of the resource. A workflow may impose execution constraints on the tasks. We adopt the concept of directed acyclic graph used by Condor. Except for the first and last ones, each task in a workflow has its successors and predecessors. We define the successors and predecessors as follows:

$$\mathrm{successors}(t) = \{\, t_i \mid t_i \text{ can execute only after } t \text{ finishes} \,\}$$

$$\mathrm{predecessors}(t) = \{\, t_i \mid t_i \text{ must finish before } t \text{ executes} \,\}$$

The workflow also comes with a deadline before which all tasks are expected to be completed. After defining the workflow, we also make some assumptions about the resources. For every resource instance, only one task can access the resource instance at one time. A Grid computing environment might have many instances of computational resources, which are regarded as goods in the market. We define a unit of commodity as the pair of a resource type and a time slot. For example, (CPU, 1) represents the commodity of the resource type CPU at time 1. There may be several units of the same commodity in the market. For simplicity we treat those instances as if they have identical resource capabilities. For example, there can be two (CPU, 1) commodities in the market, but the clients treat them as interchangeable.

3.2. Market clearing

For the market to reach equilibrium (supply equals demand), clients and resource providers should be able to dynamically adjust the supply and demand. However, resource providers in a Grid computing environment do not often increase their quantity of resource supply unless they acquire new hardware. Here, the resource market is regarded as reaching the equilibrium if the following constraint is satisfied:

$$\sum_{i} q_i \le Q$$

$q_i$: the resource quantity that workflow agent $i$ requires
$Q$: the quantity that the resource can provide

Since resource supply does not change often, Grid clients will often need to adjust their resource requirements. Specifically, there may not be enough resources to satisfy all tasks when multiple Grid clients want to schedule their tasks. If the clients execute their tasks anyway, some of them will fail. Thus, the clients put their tasks, each attached with a price, into the resource market as bids. Note that the whole task, rather than the individual resource units of the task, is attached with one price, because the task is not executable if any of the required resources is not won. When the resource market clears, tasks with the highest price are allocated resources first, then tasks with the second highest price, and so on. Since resources are limited, the tasks with lower prices may lose the bid. In this case, the owners of these tasks must readjust their resource requirements by planning a new schedule, and bid again.

3.3. Bidding price calculation

Grid clients must calculate the prices for their tasks in order to participate in the resource market. For the scope of this paper, a heuristic function for generating prices for tasks in the workflow is designed to increase the chance of winning all required resources so that the workflow can be completed before its deadline. Before the prices are calculated, a depth-first search algorithm traverses the workflow and calculates for each task the minimal remaining time needed to finish the workflow after that task is completed. Let the time required to execute the part of the workflow trailing task $t$ be $\mathrm{trailing}(t)$. One simple heuristic used here is that a task is more urgent if the earliest possible time to finish the workflow after the execution of that task is closer to the deadline. The more urgent tasks are given higher prices. Another heuristic is that tasks using a lot of scarce resources have a lower price, because allocating resources to them early will possibly cause many other workflows to be delayed. The price calculation function is defined as follows:

$$\mathrm{price}(t) = (-1) \times \Bigl( d - \mathrm{trailing}(t) - \sum_{i=1}^{n} \bigl( C(r_i) \times q(r_i, t) \bigr) \Bigr)$$

$d$: the deadline of the workflow
$r_i$: a resource type
$C(r_i)$: mapping resource $r_i$ to a constant
$q(r_i, t)$: the quantity of $r_i$ required by task $t$
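The two steps of the price calculation, the depth-first computation of the trailing time and the price heuristic itself, can be sketched as below. The sketch assumes a price of the form price(t) = -(d - trailing(t) - Σ C(r)·q(r, t)). The workflow in `TASKS` and `SUCCESSORS` and the constants in `SCARCITY` are invented for illustration; the constants are chosen negative for scarce resource types so that heavy use of them lowers the price, in line with the second heuristic.

```python
from functools import lru_cache

# Hypothetical workflow: task -> (duration, {resource type: quantity}).
TASKS = {
    "a": (2, {"cpu": 1}),
    "b": (3, {"cpu": 2, "disk": 1}),
    "c": (1, {"disk": 1}),
    "d": (2, {"cpu": 1}),
}
# Successor lists forming a DAG: a -> b, a -> c, b -> d, c -> d.
SUCCESSORS = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}
# Assumed per-resource constants; more negative means scarcer.
SCARCITY = {"cpu": -0.5, "disk": -0.1}


@lru_cache(maxsize=None)
def trailing(task):
    """Minimal time needed to finish the workflow after `task` completes."""
    # Depth-first search for the longest chain of successor durations.
    return max(
        (TASKS[s][0] + trailing(s) for s in SUCCESSORS[task]),
        default=0,
    )


def price(task, deadline):
    """Assumed form: -(d - trailing(t) - sum of C(r) * q(r, t))."""
    demand = TASKS[task][1]
    weighted = sum(SCARCITY[r] * q for r, q in demand.items())
    return -(deadline - trailing(task) - weighted)


# Task "a" has the longest trailing path, so it gets the highest price.
prices = {t: price(t, deadline=10) for t in TASKS}
```

Memoizing `trailing` keeps the traversal linear in the size of the workflow graph, since each task's trailing time is computed once even when several paths reach it.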

4. Multi-agent system architecture for the resource market

Figure 1 shows the multi-agent system architecture for the computational resource market. Four types of agents in the system have been implemented:


Figure 1. The system architecture.

The Resource Provider Agent provides information about its status and available time slots to the Information Service Agent.
The Workflow Agent is the Grid client that plans a schedule for a workflow and re-plans whenever needed.
The Information Service Agent responds to queries about the availability of resources and their prices.
The Market Broker Agent performs market clearing based on the market model and allocates resources accordingly.

The execution flow of the system is described as follows. Each step below corresponds to the arrow in Figure 1 with the same number:

1. The Resource Provider Agent reports to the Information Service Agent the number of each type of resource commodity it has.
2. Each Workflow Agent acquires the information about resource availability from the Information Service Agent.
3. Based on the information, the Workflow Agent plans a schedule for its workflow. Tasks in the workflow that have no predecessor, and tasks whose predecessors have all been allocated their required resources, are selected as the eligible tasks for bidding. Other tasks are not eligible, because allocating resources to those tasks may cause a task to be executed before its predecessors, violating execution constraints. The Workflow Agent then calculates the prices for the eligible tasks.
4. The eligible tasks are submitted to the Market Broker Agent.
5. After a fixed amount of time, the Market Broker Agent collects the tasks from all bids. It then queries the Information Service Agent for the latest information about resource availability, against which the bids are matched.
6. The market clears with the following procedure: The Market Broker Agent sorts the tasks in descending order of price. Starting from the task with the highest price, the Market Broker Agent tests whether there are still enough resources for the task. If so, it allocates the required resources to the task. It then passes to the next task, until every task has been tested.
7. The Market Broker Agent notifies each Workflow Agent whether its bid has won or not.
8. The Market Broker Agent passes the allocation decision to the Information Service Agent to update the availability information. Each Workflow Agent marks the winning tasks as satisfied, and re-plans the rest of the workflow based on the new availability information. It then chooses a new set of eligible tasks and starts another round of bidding from step 2, until all tasks are allocated their required resources.
9. Now the completed workflow schedules can be executed without conflicts. Each Workflow Agent executes its workflow by contacting the Resource Provider Agent to request resource access.
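Steps 3 and 6 above can be sketched in a few lines. The function names `eligible_tasks` and `clear_market`, the tuple layout of a bid and the sample data are assumptions made for this illustration; the agents' messaging and the re-planning in step 8 are left out.

```python
from collections import Counter


def eligible_tasks(predecessors, allocated):
    """Step 3: unallocated tasks whose predecessors are all allocated."""
    return [
        t for t, preds in predecessors.items()
        if t not in allocated and all(p in allocated for p in preds)
    ]


def clear_market(bids, available):
    """Step 6: allocate commodities to bids in descending price order.

    bids: list of (agent, task, price, demand), where demand maps a
          commodity such as ("cpu", 1) to the number of units needed.
    available: Counter of free units per commodity; updated in place.
    Returns the list of winning (agent, task) pairs.
    """
    winners = []
    for agent, task, _price, demand in sorted(
        bids, key=lambda b: b[2], reverse=True
    ):
        # A task wins only if every commodity it needs is still free.
        if all(available[c] >= n for c, n in demand.items()):
            available.subtract(demand)
            winners.append((agent, task))
    return winners


# Hypothetical round: two CPU units and one disk unit at time slot 1.
available = Counter({("cpu", 1): 2, ("disk", 1): 1})
bids = [
    ("w1", "a", -5.5, {("cpu", 1): 1, ("disk", 1): 1}),
    ("w2", "x", -3.0, {("cpu", 1): 1}),
    ("w3", "m", -8.0, {("cpu", 1): 1}),
]
winners = clear_market(bids, available)  # w2 and w1 win, w3 must re-plan
```

Because a bid is accepted or rejected as a whole, a task never ends up holding only part of the resources it needs at a given time slot.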

5. Experiment

The setting of the simulation contains six manually specified workflows in a computational environment of 15 CPUs, 15 memory units, 10 network connections, 30 disks and 7 other resources. Each Workflow Agent randomly adopts one of the six workflows. Given the minimum workflow completion time $t$, the deadline for the agent that adopts that workflow is a random number within a range determined by $t$. Our market-based approach is compared to FCFS and earliest-deadline-first (EDF), a variant of shortest-job-first, using the same combination of the workflows. The FCFS and EDF algorithms require a central scheduler that schedules resources for all workflows. The result is the average of fifty simulations.

The main criterion in this experiment is the ratio of successful workflow completion before deadline. Figure 2 shows the results according to this criterion. The performance of all methods drops when the number of participating workflows increases, because the amount of resources cannot support the increasing demand. Note that none of the methods is an optimal solution; computing an optimal solution would be an NP-complete problem. The FCFS method schedules the tasks regardless of the deadline, and therefore has the worst performance. We also present the EDF method, which first schedules the workflow with the earliest deadline, for comparison. Although EDF takes the deadline into consideration, it still schedules one workflow at a time, not allowing the tasks belonging to different workflows to compete. Figure 2 shows that the performance of the market-based mechanism is better than EDF regardless of the system load. Although our market model is not optimized for reducing delay time, the average delay time of our approach is still better than that of FCFS, but not as good as that of EDF, as Figure 3 shows.

Figure 2. The average ratio of job completion before deadline compared to FCFS and EDF methods. (Horizontal axis: number of participating workflows.)

Figure 3. The average delay time compared to FCFS and EDF methods. (Horizontal axis: number of participating workflows.)

6. Conclusion and future work

A multi-agent prototype for market-oriented multiple resource scheduling has been implemented to deal with the scheduling problem in Grid computing environments, where jobs are complex workflows involving the use of multiple types and units of resources. In the prototype system, Workflow Agents maintain the internal consistency of their workflows and plan schedules by advance reservation, while the Market Broker Agent coordinates resource access among Workflow Agents. The result is that all workflows are allocated their required resources without conflicting with other workflows. One advantage of this multi-agent approach is that it distributes the computational complexity among clients. Unlike centralized approaches, in which a single scheduler plans schedules for all workflows, the Workflow Agents in this multi-agent approach plan for their own workflows. Simulations have been carried out, and the experiment results show the feasibility of the approach. The market-oriented mechanism performs better than centralized approaches, including FCFS and the earliest-deadline-first method, in terms of the ratio of successful job completion before deadline.

Future work includes two directions. In terms of performance, the scheduling mechanism needs to be tested in complex scenarios where workflows of varying length and resource usage can join at any point in time. In terms of the system architecture, the scheduling system can be integrated with policy-based access control mechanisms [9] to enforce resource allocation.

Acknowledgements

This research is supported in part by the MOE Program for Promoting Academic Excellence of Universities under grant number 89-E-FA04-1-4, and also by MOEA grant number 93-EC-17-A-05-S1-030.


References

[1] D. Parkes, iBundle: An Efficient Ascending Price Bundle Auction, Proceedings of the 1st ACM Conference on Electronic Commerce, 1999.
[2] G. Weiss, Multiagent Systems: A Modern Approach to Distributed Artificial Intelligence, pages 201-233, MIT Press, ISBN: 0262731312, 2000.
[3] I. Foster, C. Kesselman, J. Nick, S. Tuecke, The Physiology of the Grid: An Open Grid Services Architecture for Distributed Systems Integration, Open Grid Service Infrastructure WG, Global Grid Forum, 2002.
[4] I. Foster, C. Kesselman, S. Tuecke, The Anatomy of the Grid: Enabling Scalable Virtual Organizations, International Journal of Supercomputer Applications and High Performance Computing, 2001.
[5] I. Foster, C. Kesselman, The Grid: Blueprint for a New Computing Infrastructure, 2nd Edition, Morgan Kaufmann, ISBN: 1-55860-933-4, 2004.
[6] J. Frey, T. Tannenbaum, I. Foster, M. Livny, and S. Tuecke, Condor-G: A Computation Management Agent for Multi-Institutional Grids, Journal of Cluster Computing, volume 5, pages 237-246, 2002.
[7] L. Hunsberger, B. Grosz, A Combinatorial Auction for Collaborative Planning, Proceedings of the Fourth International Conference on Multi-Agent Systems, IEEE Computer Society Press, pages 151-158, 2000.
[8] M. Cannataro, D. Talia, and P. Trunfio, Distributed Data Mining on the Grid, Future Generation Computer Systems, volume 18, pages 1101-1112, 2002.
[9] M. Johnson, P. Chang, J. Bradshaw, M. Breedy, L. Bunch, S. Kulkarni, J. Lott, N. Suri, A. Uszok, and V.-W. Soo, Semantic Policy and Domain Services: An Application of DAML to Web Services-Based Grid Architectures, AAMAS Workshop on Web Services, 2003.
[10] M. Wellman, Market-Oriented Programming: Some Early Lessons, in Market-Based Control: A Paradigm for Distributed Resource Allocation, World Scientific, 1996.
[11] M. Wellman, W. Walsh, Auction Protocols for Decentralized Scheduling, Games and Economic Behavior, volume 35, pages 271-303, 2001.
[12] M. Wellman, A Market-Oriented Programming Environment and its Application to Distributed Multicommodity Flow Problems, Journal of Artificial Intelligence Research, volume 1, pages 1-23, 1993.
[13] R. Raman, M. Livny, and M. Solomon, Matchmaking: Distributed Resource Management for High Throughput Computing, Proceedings of the Seventh IEEE International Symposium on High Performance Distributed Computing, 1998.
[14] T. Sandholm, Algorithm for Optimal Winner Determination in Combinatorial Auctions, Artificial Intelligence, volume 135, pages 1-54, 2002.
[15] The Condor Project.
[16] P. Wurman, W. Walsh and M. Wellman, Flexible Double Auctions for Electronic Commerce: Theory and Implementation, Decision Support Systems, 24: 17-27, 1998.
[17] I. Foster, What is the Grid? A Three Point Checklist, July 20, 2002.
[18] G. Weiss (ed.), Multiagent Systems: A Modern Approach to Distributed Artificial Intelligence, MIT Press, 1999.
