Quality-of-Service Driven Visual Scheduling in Grid Computing

Changqin Huang1,2, Guanghua Song1,2, and Yao Zheng1,2

1 College of Computer Science, Zhejiang University, Hangzhou, 310027, P.R. China
2 Center for Engineering and Scientific Computation, Zhejiang University, Hangzhou, 310027, P.R. China
Abstract. To make full use of grid resources and to meet users’ requirements, efficient scheduling is a key concern in grid environments. Aiming at grid-based engineering computation applications, this paper proposes a Quality of Service (QoS) driven, user-centric scheduling strategy. Firstly, degree of credit and degree of guarantee are defined, and an aggregate utility ratio is modeled as a composite QoS. Secondly, for different types of grid users, two scheduling methods and steering-enabled visual interfaces are presented, respectively. Thirdly, four performance metrics and the aggregate utility ratio are visualized to facilitate the user’s interaction with scheduling. Finally, corresponding post-scheduling mechanisms are designed to cope with scenarios where scheduled tasks cannot obtain the expected QoS. This study is part of a grid project, MASSIVE, and the experiments show that the visual scheduling strategy presented is suitable for computational grids.
H. Jin, Y. Pan, N. Xiao, and J. Sun (Eds.): GCC 2004 Workshops, LNCS 3252, pp. 744–752, 2004. © Springer-Verlag Berlin Heidelberg 2004

1 Introduction

Grid computing is becoming a new computing infrastructure for scientific computing and cooperative work, and it promotes users’ collaboration through flexible and coordinated sharing of distributed resources. The performance that a grid can deliver varies dynamically due to resource competition, network status, task type, and so on. To improve the performance of a grid, it is necessary to provide mechanisms that perform effective task scheduling in the grid.

Our investigation of existing grid scheduling methodologies reveals two shortcomings: (1) Although many performance metrics are considered, such as system utilization, throughput, turnaround time, and waiting time, aggregate metrics are seldom considered, and the QoS of scheduling is addressed only insufficiently. (2) In most scheduling methodologies, the scheduling mechanism is regarded only as a part of the underlying infrastructure. That is, these methodologies are oriented to grid systems, not to grid users, and they do not provide users with convenient steering capabilities.

To overcome these weaknesses of conventional grid scheduling, this study concentrates on user-centric grid scheduling to better satisfy users’ requirements. Our scheduling approach models an aggregate utility ratio as a composite performance metric, i.e. QoS, and the grid user’s requirements can be met by improving four performance metrics and the aggregate utility ratio. Because visualization helps in understanding and analyzing computational information, it is used in our user-centric scheduling so that the necessary interactions between users and the system can take place conveniently. Furthermore, unexpected scheduling results are remedied during post-scheduling, based on QoS thresholds.

This paper is organized as follows. The next section introduces related work. Section 3 presents a composite user-concerned QoS model for scheduling and defines four performance metrics. The visual scheduling framework is described in Section 4, while the automatic and manual visual scheduling interfaces and QoS-based visual steering are addressed in Section 5. Section 6 studies a post-scheduling mechanism that takes QoS into account, and the last section outlines conclusions and future work.
2 Related Works

To meet system and users’ requirements in grid environments, a variety of scheduling strategies and algorithms have been proposed. Buyya et al. [2] propose a scheduling algorithm that considers only two performance metrics: cost and time. Cheng et al. [3] study the feasibility problem of scheduling a set of start-time-dependent tasks with deadlines and identical initial processing times; however, they set strict constraints (e.g. a single machine). Beaumont et al. [4] aim at the scheduling of independent, equal-sized tasks and improve the performance by making full use of a system metric (bandwidth). Based on time-varying resource prices, Dogan et al. [5] consider the problem of statically scheduling a set of independent tasks with multiple QoS requirements. He et al. [6] introduce the matching of QoS requests and services between tasks and hosts, based on the conventional Min-Min algorithm. However, the QoS is concerned only with the completion time, and scheduling distinguishes only two types: high-QoS tasks and low-QoS tasks. Chen et al. [7] incorporate QoS management into the Open Grid Services Architecture (OGSA) and provide a high-level middleware to build complex applications with QoS guarantees. The job scheduling is oriented to the service grid, and the QoS focuses on Success Ratio and In-Time Ratio. Abeni et al. [8] introduce a statistical guarantee of deadlines based on inter-arrival and execution time probability distributions. However, it is more applicable to real-time systems than to grid environments. Chun et al. [9] present a scheduling approach based on resource markets and focusing on user-centric performance. These studies differ from ours in one respect: they aim at non-interactive scheduling in clusters, not at visual scheduling. Islam et al. [10] provide QoS with the response time given by the end user, in the form of guarantees of the completion time for submitted independent parallel jobs; however, they have not yet considered aggregate QoS and visual steering.

Visualization is a promising approach in grids. Shalf et al. [1] investigate the numerous issues of implementing grid-enabled distributed visualization and propose a distributed visualization architecture. However, visual steering of scheduling is not their emphasis. Jiang et al. [11] propose a rule-based visualization mechanism for a computational steering collaboratory, which allows users to extract regions of interest to visualize, and to track and quantify the evolution of these features in grid environments. Our work, in contrast, provides visual scheduling control and visual performance presentation. Bonnassieux et al. [12] concentrate on automated resource and service discovery and monitoring, and design a flexible grid visualization tool to represent all the corresponding virtual views needed. However, scheduling and QoS are not considered in [11,12].
3 Composite Quality of Service (QoS) Model in Scheduling

In practice, the popularization of grid applications relies greatly on grid users’ concerns; therefore user-oriented QoS is very important. At present, budget/cost and deadline/time have been introduced as QoS parameters. Grid resources are normally highly dynamic and heterogeneous, whilst the tasks to be scheduled arrive dynamically for execution across VOs. Therefore, more performance parameters should be applied to reflect the actual characteristics of grids. Here, in addition to budget/cost and deadline/time, we introduce two metrics: degree of credit and degree of guarantee. Moreover, we define the aggregate utility ratio as a composite QoS, according to which the scheduler makes a dynamic schedule.

The composite QoS model is shown in Figure 1. A user-oriented QoS, in the form of the aggregate utility ratio, is composed of four performance parameters, each with its own weight. All these performance metrics affect the scheduling by exchanging information with the scheduler, and all required values for a given task can be entered via a graphical interface. After a user’s task is scheduled, all values are displayed for the user’s reference and serve the post-scheduling if necessary. The definitions of all performance metrics are given as follows.

[Figure 1 components: Visual User Interface for Performance Parameter (budget, deadline, degree of credit, and degree of guarantee); Cost; Completion Time; Degree of Credit; Degree of Guarantee; Aggregate Utility Ratio; Scheduler]

Fig. 1. A diagram of quality of service model and interaction relations.
Definition 1. Cost C is the amount of “money” based on a pay-per-use mechanism. Let N denote the number of used resource units, UT the time for which those resources are used by the task, and P the “price” of one unit of used resource in one unit of time. Assuming a uniform “money” unit, the cost C of a task of a certain user is defined as

\[ C = N \cdot UT \cdot P. \tag{1} \]
Definition 2. Completion time CT is defined as the wall-clock time at which nodes complete a certain task (after having finished any previously assigned tasks) [6]. Let AT denote the arrival time of the task, ST the starting time of the task, and ET the expected execution time. From the above definitions, we have

\[ CT = ST + ET. \tag{2} \]
Definition 3. Degree of credit DC denotes the success ratio of the actual service provided by resources across VOs. In this study, DC is used only for grid nodes, and it is obtained from the historical information in the activity profiles of nodes. Let TA denote the number of tasks accepted so far, and TC the number of tasks completed under the constraints of users. Then DC is defined as

\[ DC = TC / TA. \tag{3} \]
Definition 4. Degree of guarantee DG denotes the probability of task completion before the deadline. Let ET denote the expected execution time, ST the starting time of the task, D the deadline of this task, and P the accuracy ratio of the predicted execution time. Then DG is defined as

\[ DG = \bigl(P - P(D - ST)/ET\bigr)/(1 - P) \tag{4} \]

if D-ST ET; otherwise DG = 0.
In the above definitions, two concepts denote the uncertainty of a grid: degree of credit DC and degree of guarantee DG. The expected execution time ET is a predicted value (produced by the performance predictor), and the used time UT of a certain resource is the time pre-scheduled by the scheduler for the pair of this resource and its associated task. Let B denote the user budget; then the composite quality of service (Aggregate Utility Ratio, AUR) is defined as

\[ AUR = w_1 (B/C) \cdot w_2 (D/CT) \cdot w_3\, DC \cdot w_4\, DG \tag{5} \]

In Eq. (5), $w_1$, $w_2$, $w_3$, and $w_4$ stand for the weights of the four performance factors, and they can be set by the administrator or specified by the VOs.
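The definitions above can be illustrated with a minimal Python sketch. It evaluates Eqs. (1) to (3) and the weighted product of Eq. (5) for one task. The class and function names are hypothetical and not part of MASSIVE, the weights default to 1.0 purely for illustration, and the degree of guarantee is passed in as an input rather than derived from Eq. (4).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaskQoS:
    """Inputs for one task; field names are illustrative, not from the paper."""
    budget: float               # B, user budget
    deadline: float             # D, deadline (same time unit as CT)
    units: int                  # N, number of used resource units
    used_time: float            # UT, pre-scheduled usage time
    price: float                # P in Eq. (1), price per unit per unit of time
    start_time: float           # ST, starting time
    exec_time: float            # ET, predicted execution time
    tasks_accepted: int         # TA, from the node's activity profile
    tasks_completed: int        # TC, from the node's activity profile
    degree_of_guarantee: float  # DG, supplied by the caller (assumption)


def cost(q: TaskQoS) -> float:
    """Eq. (1): C = N * UT * P."""
    return q.units * q.used_time * q.price


def completion_time(q: TaskQoS) -> float:
    """Eq. (2): CT = ST + ET."""
    return q.start_time + q.exec_time


def degree_of_credit(q: TaskQoS) -> float:
    """Eq. (3): DC = TC / TA."""
    if q.tasks_accepted == 0:
        raise ValueError("node profile has no accepted tasks")
    return q.tasks_completed / q.tasks_accepted


def aggregate_utility_ratio(q: TaskQoS, weights=(1.0, 1.0, 1.0, 1.0)) -> float:
    """Eq. (5): product of the four weighted performance factors."""
    w1, w2, w3, w4 = weights
    return (
        (w1 * q.budget / cost(q))
        * (w2 * q.deadline / completion_time(q))
        * (w3 * degree_of_credit(q))
        * (w4 * q.degree_of_guarantee)
    )


example = TaskQoS(
    budget=120, deadline=100, units=4, used_time=10, price=2,
    start_time=20, exec_time=30, tasks_accepted=10, tasks_completed=8,
    degree_of_guarantee=0.5,
)
print(cost(example), completion_time(example), aggregate_utility_ratio(example))
```

For the example task, C = 80 and CT = 50, so B/C = 1.5 and D/CT = 2; with DC = 0.8, DG = 0.5, and unit weights, the product is approximately 1.2.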
4 Visual Scheduling Framework

To improve the quality of scheduling and to give users better control over steering, the scheduling can be performed in a visual manner, through two types of visual windows: scheduling sessions and QoS sessions. These visual methods provide users with direct awareness and friendly interaction. A visual scheduling framework is shown in Figure 2, with the following features.
1. According to the type of user, the scheduling is performed in either a manual or an automatic manner, with a simple monitoring mechanism.
2. In addition to the arrived task queue, a reserved task queue is used for arranging pre-scheduled tasks. Hence reservation-based scheduling can perform well.
3. The scheduling algorithm used during automatic scheduling can be selected by users, as various conventional algorithms can be integrated into the system. Moreover, the system is extensible so that new algorithms can be included. The idea is based on the fact that different algorithms suit different environments and objectives.
4. The performance predictor serves the scheduler by predicting and computing the previously mentioned performance metrics. It analyzes in advance the performance of the tasks to be scheduled; furthermore, it evaluates all QoS parameters of scheduled tasks. The values are provided to users in a visual fashion.
5. The quality of service manager is responsible for accepting the required performance values from the input, for setting the performance threshold values that trigger adjustment mechanisms and user warnings, and for managing the post-scheduling events and the interactions for better performance.

[Figure 2 components: QoS Session; Scheduling; Performance Predictor; Quality of Service Manager; Automatic Scheduler; Manual Scheduler; Arrived Task Queue; Reserved Task Queue; Grid Resources]

Fig. 2. A visual scheduling framework.
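Features 2 and 3 of the framework (two task queues and user-selectable algorithms) could be organized along the following lines. This is only a sketch under stated assumptions: the registry, the two toy algorithms, the dictionary fields, and the policy of serving the reserved queue before the arrived queue are all hypothetical choices made for illustration and are not taken from the MASSIVE implementation.

```python
from collections import deque
from typing import Callable, Dict, List, Optional, Tuple

# A scheduling algorithm maps (task, candidate nodes) to a node name, or None.
Algorithm = Callable[[dict, List[dict]], Optional[str]]

# Registry of selectable algorithms; new ones are added with @register.
ALGORITHMS: Dict[str, Algorithm] = {}


def register(name: str):
    """Decorator that makes an algorithm selectable by name."""
    def wrap(fn: Algorithm) -> Algorithm:
        ALGORITHMS[name] = fn
        return fn
    return wrap


@register("cheapest")
def cheapest(task: dict, nodes: List[dict]) -> Optional[str]:
    """Pick the lowest-priced node whose estimated cost fits the budget."""
    fits = [n for n in nodes if n["price"] * task["est_time"] <= task["budget"]]
    return min(fits, key=lambda n: n["price"])["name"] if fits else None


@register("fastest")
def fastest(task: dict, nodes: List[dict]) -> Optional[str]:
    """Pick the node with the highest speed rating."""
    return max(nodes, key=lambda n: n["speed"])["name"] if nodes else None


class Scheduler:
    """Holds the arrived and reserved task queues of the framework."""

    def __init__(self) -> None:
        self.arrived: deque = deque()
        self.reserved: deque = deque()

    def submit(self, task: dict, reserve: bool = False) -> None:
        (self.reserved if reserve else self.arrived).append(task)

    def schedule_next(self, nodes: List[dict],
                      algorithm: str) -> Optional[Tuple[str, Optional[str]]]:
        # Assumption: reserved (pre-scheduled) tasks are served first.
        queue = self.reserved if self.reserved else self.arrived
        if not queue:
            return None
        task = queue.popleft()
        return task["id"], ALGORITHMS[algorithm](task, nodes)
```

Keeping the algorithms behind a name-to-function registry means a new algorithm can be added without touching the scheduler itself, which matches the extensibility goal stated in feature 3.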
5 Implementation of Visual Scheduling

We have developed a visual grid prototype system oriented to engineering computing, named MASSIVE (formerly VGrid [13]). This study is a part of the MASSIVE project, and we adopt Globus Toolkit 2.4 as the underlying middleware and development tool on Linux systems. The QoS model and the visual scheduling framework are implemented with the aid of the KDevelop package. All visual sessions are refreshed at a fixed interval or when triggered by associated events. Figures 3 and 4 show a manual scheduling session and an automatic scheduling session, respectively, where the following details are worth noting.

1. Both visual scheduling sessions provide an area where the monitoring results are displayed, and three operations through which users can interact with the system: view “RSL file”, “Reschedule”, and “Submit to run”.
2. Tasks are scheduled under all constraints, including diverse performance requirements, and the performance predictor helps the scheduler make decisions. In our study, a simple prediction module oriented to engineering computing is developed to serve that purpose.
3. In Figure 4, the right middle area is designed for steering the tasks in the “Reserved Task Queue”.

Visual steering for quality of service is shown in Figure 5, of which the following details are worth noting.
Fig. 3. A session of manual scheduling.
Fig. 4. A session of automatic scheduling.
Fig. 5. Visual steering for quality of service.
1. The button “SetupForAdministrator” is designed for important QoS management operations. For instance, all threshold values of the performance metrics can be set via this entry point.
2. The values of all four performance metrics are presented as percentages, where 100% denotes the best value from the viewpoint of the submitting user. Similarly, quality of service, shown as “QoSPercent”, indicates the composite performance of the currently selected task.
3. Grid users can set the initial performance value in the corresponding “RequValue” area. In the row for “Time”, the input value is used as the deadline, and in the row for “Cost”, the input value is used as the expected “monetary” budget. That is, these two inputs are set as actual values, and the rest are percentages.
4. Grid users can also set the initial value in the corresponding “KillTaskConditions” area to decide whether to cancel their scheduled task when a certain performance value does not meet the given requirements.
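The percentage presentation and the “KillTaskConditions” check described above might be sketched as follows. The paper does not state how an actual cost or time is mapped to a percentage, so the mapping used here (required value divided by actual value, capped at 100%) is an assumption, and the function names and dictionary layout are hypothetical.

```python
from typing import Dict, List


def to_percent(required: float, actual: float) -> float:
    """Map a lower-is-better metric (cost or time) to a percentage.

    Assumed mapping: 100 means the requirement is met or beaten;
    smaller values mean the actual value overshoots the requirement.
    """
    if actual <= 0:
        raise ValueError("actual value must be positive")
    return min(required / actual, 1.0) * 100.0


def violated_conditions(percent_values: Dict[str, float],
                        kill_conditions: Dict[str, float]) -> List[str]:
    """Return the metrics whose percentage is below the user's kill threshold."""
    return sorted(
        name for name, minimum in kill_conditions.items()
        if percent_values.get(name, 100.0) < minimum
    )


# Budget 80 against a predicted cost of 100; deadline 50 against a
# predicted completion time of 40. Credit and guarantee are already
# percentages in the interface.
values = {
    "Cost": to_percent(80, 100),
    "Time": to_percent(50, 40),
    "Credit": 90.0,
    "Guarantee": 60.0,
}
print(violated_conditions(values, {"Cost": 85.0, "Guarantee": 50.0}))
```

In this example the cost percentage falls below the user's kill threshold while the guarantee stays above it, so only "Cost" is reported.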
6 Mechanism of Post-scheduling

If problems arise during the execution of scheduled tasks, the above-mentioned performance values may no longer satisfy users’ requirements. Robust scheduling therefore requires a sound post-scheduling mechanism. In this study, our basic scenario is as follows: running tasks are progressively stopped when the values given by the predictor reach the threshold values or exceed the initial constraint values. Following the user-centric approach, post-scheduling operations can be conducted either manually or automatically. If users have set the corresponding “KillTaskConditions”, the system first checks these conditions; if one of them is matched, the scheduled task is killed, and the associated node profiles are modified to affect future performance prediction. Otherwise, post-scheduling performs one of the following actions, according to the set rules and the specified conditions.

1. Kill the task, release the current resource(s), and modify the corresponding profiles.
2. Let the task continue running on the current node(s) and, meanwhile, run it on one or more new nodes in parallel. When any of these nodes or sets of nodes completes the task, the rest cancel their copies of the task and release themselves. Modify the corresponding profiles.
3. Kill the task and release the current resource(s); meanwhile, run the task on one or more new nodes in parallel. When any of these nodes or sets of nodes completes the task, the rest cancel their copies of the task and release themselves. Modify the corresponding profiles.
4. Kill the task and release the current resource(s); after that, put the task into the “Arrived Task Queue” or “Reserved Task Queue” to let the scheduler reschedule it. Modify the corresponding profiles.
5. Save the necessary information and migrate the task to one or more new nodes, release the old resource(s), and then let these new nodes perform in parallel. Lastly, modify the corresponding profiles.

At present, we have implemented only the first four actions, in the manual manner, by coupling them with the above scheduler. More issues of design and implementation of the post-scheduling will be studied in our future work.
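The decision flow described in this section (user kill conditions first, then a rule-selected action) can be sketched as below. The paper does not specify the rule format, so the ordered list of (metric, action) pairs, the percentage-based thresholds, the default action, and all names are hypothetical; the sketch only selects an action and does not execute it.

```python
from enum import Enum
from typing import Dict, List, Optional, Tuple


class Action(Enum):
    """The five post-scheduling actions listed above (names are illustrative)."""
    KILL = 1
    REPLICATE_KEEP_ORIGINAL = 2
    REPLICATE_KILL_ORIGINAL = 3
    REQUEUE = 4
    MIGRATE = 5


def choose_action(predicted: Dict[str, float],
                  thresholds: Dict[str, float],
                  kill_conditions: Dict[str, float],
                  rules: List[Tuple[str, Action]],
                  default: Action = Action.REQUEUE) -> Optional[Action]:
    """Select a post-scheduling action from predicted metric percentages."""
    # Step 1: user-set kill conditions are checked before anything else.
    for metric, minimum in kill_conditions.items():
        if predicted.get(metric, 100.0) < minimum:
            return Action.KILL
    # Step 2: find metrics that dropped below their thresholds.
    breached = {m for m, minimum in thresholds.items()
                if predicted.get(m, 100.0) < minimum}
    if not breached:
        return None  # the task keeps running unchanged
    # Step 3: the first rule whose metric is breached decides the action.
    for metric, action in rules:
        if metric in breached:
            return action
    return default


predicted = {"Time": 70.0, "Cost": 95.0, "Guarantee": 40.0}
thresholds = {"Time": 80.0, "Guarantee": 50.0}
rules = [("Guarantee", Action.REPLICATE_KEEP_ORIGINAL), ("Time", Action.REQUEUE)]
print(choose_action(predicted, thresholds, {}, rules))
```

An ordered rule list keeps the precedence between actions explicit, which is one simple way to express "according to the set rules and the specified conditions"; other rule representations would serve equally well.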
7 Conclusions and Future Work

User-oriented Quality of Service (QoS) is a key to popularizing grid applications. However, no integrated solution has yet adequately addressed users’ requirements during grid scheduling. In this paper, we have modeled a new composite quality of service and its associated performance metrics, such as degree of credit and degree of guarantee, which progressively reflect the grid quality status. Aiming at the requirements of engineering computation applications, a QoS-driven visual scheduling framework is proposed. For different types of grid users, two scheduling methodologies and steering-enabled visual scheduling interfaces are designed and implemented, respectively. Four performance metrics and an aggregate utility ratio improve users’ capability of steering QoS-driven scheduling in a visual fashion. Finally, corresponding post-scheduling mechanisms are designed to cope with cases where scheduled tasks cannot obtain the expected QoS. We have conducted some experiments in a test bed named MASSIVE. They show that this visual scheduling approach is suitable for computational grids.

In the future, we plan to study performance prediction technologies for grid computing in the area of scientific and engineering computation, and to use further cases to test this visual scheduling prototype. We also plan to study the automation of post-scheduling, the migration of tasks, and recovery mechanisms in depth.
References

1. J. Shalf and E. W. Bethel: The Grid and Future Visualization System Architectures, IEEE Computer Graphics and Applications, 23(2): 6-9, 2003.
2. R. Buyya, et al.: A Deadline and Budget Constrained Cost-Time Optimization Algorithm for Scheduling Task Farming Applications on Global Grids, Proc. of the International Conference on Parallel and Distributed Processing Techniques and Applications, 2002.
3. T. Cheng, et al.: Scheduling Start Time Dependent Tasks with Deadlines and Identical Initial Processing Times on a Single Machine, Computers & Operations Research, 30: 51-62, 2003.
4. O. Beaumont, et al.: Bandwidth-Centric Allocation of Independent Tasks on Heterogeneous Platforms, Proc. of the International Parallel and Distributed Processing Symposium, 2002.
5. A. Dogan and F. Özgüner: Scheduling Independent Tasks with QoS Requirements in Grid Computing with Time-Varying Resource Prices, Proc. of Grid Computing-GRID 2002, 2002.
6. X.S. He, X.H. Sun, and G.V. Laszewski: QoS Guided Min-Min Heuristic for Grid Task Scheduling, Journal of Computer Science & Technology, 18(4): 442-451, 2003.
7. H.H. Chen, H. Jin, et al.: Early Experience in QoS-Based Service Grid Architecture, APWeb 2004, LNCS 3007, pp. 924–927, 2004.
8. L. Abeni and G. Buttazzo: QoS Guarantee Using Probabilistic Deadlines, Proc. of the IEEE Euromicro Conference on Real-Time Systems, 1999.
9. B.N. Chun and D.E. Culler: User-centric Performance Analysis of Market-based Cluster Batch Schedulers, Proc. of the 2nd IEEE/ACM International Symposium on Cluster Computing and the Grid, 2002.
10. M. Islam, et al.: QoPS: A QoS Based Scheme for Parallel Job Scheduling, JSSPP 2003, LNCS 2862, pp. 252–268, 2003.
11. L. Jiang, H. Liu, et al.: Rule-Based Visualization in a Computational Steering Collaboratory, ICCS 2004, LNCS 3038, pp. 58–65, 2004.
12. F. Bonnassieux, et al.: Automatic Services Discovery, Monitoring and Visualization of Grid Environments: The MapCenter Approach, Across Grids 2003, LNCS 2970, pp. 222–229, 2004.
13. G. Wei, Y. Zheng, et al.: An Engineering Computation Oriented Visual Grid Framework, GCC 2003, LNCS 3032, pp. 51–58, 2004.