An Autonomic Network-Aware Scheduling Architecture for Grid Computing∗

Agustín Caminero¹, Omer Rana², Blanca Caminero¹, Carmen Carrión¹

¹ Department of Computing Systems. The University of Castilla La Mancha. Campus Universitario s/n, 02071 Albacete, Spain. {agustin, blanca, carmen}@dsi.uclm.es

² Cardiff School of Computer Science. 5 The Parade, Cardiff CF24 3AA, UK. [email protected]
ABSTRACT

Grid technologies have enabled the aggregation of geographically distributed resources in the context of a particular application. The network remains an important requirement for any Grid application, as the entities involved in a Grid system (such as users, services, and data) need to communicate with each other over a network. The performance of the network must therefore be considered when carrying out tasks such as scheduling, migration or monitoring of jobs. Surprisingly, many existing QoS efforts ignore the network and focus instead on processor workload and disk access. Making use of the network in an efficient and fault-tolerant manner, in the context of such existing research, leads to a significant number of research challenges. One way to address these problems is to make Grid middleware incorporate the concept of autonomic systems. Such a change involves the development of “self-configuring” systems that are able to make decisions autonomously, and to adapt themselves as the system status changes. We propose an autonomic network-aware scheduling infrastructure that is capable of adapting its behavior to the current status of the environment.

1. INTRODUCTION
Grid computing enables the aggregation of dispersed heterogeneous resources for supporting large-scale parallel applications in science, engineering and commerce [10]. Current Grid systems are highly variable environments, made up of independent organizations which share their resources, creating what are known as Virtual Organizations (VOs) [10]. This variability makes Quality of Service (QoS) highly desirable, though often very difficult to achieve in practice. One reason for this limitation is the lack of control over the network that connects the various components of a Grid system. Achieving end-to-end QoS is often difficult, as without resource reservation any guarantees on QoS are hard to satisfy. However, for applications that need a timely response (e.g., collaborative visualization [12]), the Grid must provide users with some kind of assurance about the use of resources – a non-trivial subject when viewed in the context of network QoS [15]. In a VO, entities communicate with each other using an interconnection network – resulting in the network playing an essential role in Grid systems [15].
Categories and Subject Descriptors: C.2.4 [Distributed Systems]

General Terms: Network QoS in Grid systems

∗This work has been jointly supported by the Spanish MEC and European Commission FEDER funds under grants “Consolider Ingenio-2010 CSD2006-00046” and “TIN2006-15516-C04-02”; and by JCCM under grants PBC-05-007-01, PBC-05-005-01 and José Castillejo.
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. MGC ’07, November 26, 2007, Newport Beach, CA, USA. Copyright 2007 ACM 1-XXXXX-XXX-X/07/11... $5.00
We propose an autonomic network-aware Grid scheduling architecture as a possible solution, which can make scheduling decisions based on the current status of the system (with particular focus on network capability). Hence, the main contributions of this work are twofold: first, we have improved an existing network QoS architecture to consider the status of the network in a realistic way, in order to perform the scheduling of jobs to computing resources. Second, and more importantly, we have extended that architecture with autonomic behavior, so that it is able to adapt itself to changes in the system. We also propose models for predicting network and CPU latencies, which constitute the basis for the autonomic behavior of the architecture. The structure of the paper is as follows: Section 2 reviews existing proposals for the provision of network QoS in Grid systems, and also mentions some efforts on scheduling in autonomic computing. Section 3 presents a sample scenario in which an autonomic scheduler can be harnessed. Section 4 presents the solution we propose, a network-aware autonomic meta-scheduling architecture capable of adapting its behavior depending on the status of the system; the main functionalities of the architecture are presented, including the models for predicting network and CPU latencies. Section 5 draws conclusions and outlines future work.
2. RELATED WORK
The proposed architecture is intended to manage network QoS based on the status of a Grid system, and achieves this by means of autonomic scheduling. The provision of network QoS in a Grid system has been explored by a number of other research projects, such as GARA [15], NRSE [4], G-QoSM [3], GNRB [2], GNB [5] and VIOLA [19, 20]. They are briefly reviewed below. The General-purpose Architecture for Reservation and Allocation (GARA) [15] provides uniform mechanisms for making QoS reservations for different types of resources, including computers, networks, and disks. These uniform mechanisms are integrated into a modular structure that permits the development of a range of high-level services. On the other hand, GARA has limitations when performing reservations across different domains, as it must exist in all the traversed domains, which leads to scalability problems. The Network Resource Scheduling Entity (NRSE) [4] suggests that signalling and per-flow state overhead can cause end-to-end QoS reservation schemes to scale poorly to a large number of users and multi-domain operations – as observed when using IntServ and RSVP, as well as with GARA [4]. This has been addressed in NRSE by storing the per-flow/per-application state only at the end-sites involved in the communication. Grid Quality of Service Management (G-QoSM) [3] is a framework to support QoS management in computational Grids in the context of the Open Grid Service Architecture (OGSA). G-QoSM is a generic modular system that, conceptually, supports various types of resource QoS, such as computation, network and disk storage.
This framework aims to provide three main functions: 1) support for resource and service discovery based on QoS properties; 2) provision of QoS guarantees at application, middleware and network levels, and the establishment of Service Level Agreements (SLAs) to enforce QoS parameters; and 3) support for QoS management of allocated resources, on three QoS levels: ‘guaranteed’, ‘controlled load’ and ‘best effort’. G-QoSM also supports adaptation strategies to share resource capacity between these three user categories. The Grid Network-aware Resource Broker (GNRB) [2] is an entity that enhances the features of a Grid Resource Broker with the capabilities provided by a Network Resource Manager. This leads to the design and implementation of new mapping/scheduling mechanisms that take into account both network and computational resources. The GNRB, using network status information, can reserve network resources to satisfy the QoS requirements of applications. GNRB is a framework, and does not enforce any particular algorithm for scheduling jobs to resources. The Grid Network Broker (GNB) [5] aims at providing network QoS in a single administrative domain, and is the only proposal which considers the network when scheduling jobs to computing resources. However, the network is not modelled realistically, as it is assumed that only Grid transmissions traverse the network. Many of the above efforts do not take network capability into account when scheduling tasks. The proposals which
provide scheduling of users’ jobs to computing resources are GARA, G-QoSM and GNB; the schedulers used are DSRT [6] and PBS [14] in GARA, whilst G-QoSM uses DSRT. GNB is the only proposal which considers the network but, as mentioned above, this is not done in a realistic way. These schedulers (DSRT and PBS) only pay attention to the load of the computing resource; thus, a powerful, unloaded computing resource with an overloaded network could be chosen to run jobs, which decreases the performance received by users, especially when the jobs require a lot of network I/O. Finally, VIOLA [19, 20] provides a metascheduling framework with co-allocation support for both computational and network resources. It is able to negotiate with the local scheduling systems to find and reserve a common time slot in which to execute the various components of an application. The metascheduling service in VIOLA has been implemented via the UNICORE Grid middleware for job submission, monitoring, and control. This allows a user to describe the distribution of the parallel MetaTrace application and the requested resources using the UNICORE client, while the remaining tasks, like allocation and reservation of resources, are executed automatically. A key feature of VIOLA is its network reservation capability, which allows the network to be treated as a resource within a metascheduling application. In this regard, VIOLA is somewhat similar to our approach, in that it also considers the network as a key part of the job allocation process. However, the key difference is VIOLA's focus on co-allocation and reservation, which is not always possible if the network is owned by a different administrator. Enhancing the core services of a Grid middleware with autonomic capabilities [7], to support self-management, is an important research area, and there has been some work towards autonomic Grid computing [7, 11].
Liu and Parashar [11] present an environment that supports the development of self-managed autonomic components, with the dynamic and opportunistic composition of these components using high-level policies to realize autonomic applications, and provides runtime services for policy definition, deployment and execution. In [7], an architecture to achieve automated control and management of networked applications and their infrastructure, based on XML-format specifications, is presented. In [1], an autonomic job scheduling policy for Grid systems is presented. This policy can deal with the failure of computing resources and network links, but it does not take the network into account when deciding which computing resource will run each user application; only the idle/busy periods of computing resources are used to support scheduling. Thus, the proposal of this paper is a Grid scheduling architecture which considers the current status of the system, including the network and the computing resources, in order to carry out the mapping of jobs to computing resources. Moreover, it merges Grid scheduling with autonomic computing in order to perform its task more efficiently.
3. SAMPLE SCENARIO

Figure 1: Scenario.

Figure 2: Topology.

The scenario is depicted in Figure 1. In this scenario, a user has a number of jobs, m, and requires computing resource(s) to run them. The user will have some Quality of Service
(QoS) requirements, such as execution time, throughput, response time, etc. Each such QoS requirement could be represented as (a1, a2, ..., an), where each ai is a constraint associated with an attribute (e.g. a1: throughput > 5). Hence, a user's job mi could be expressed as a set of constraints {aj} over one or more resources – expressed by the set R = {r1, ..., rk}. Let us consider that the QoS requirement of the user is that all his jobs must be finished within one hour. This deadline includes the transmission of jobs and input data to the resources, and the delivery of results. In order to have his jobs executed within the deadline, the user will contact a broker and communicate his QoS requirements to it. The role of the broker is to discover resource properties that fulfil the QoS constraints identified in the job. The constraint matching is undertaken by the broker, by contacting a set of known registry services. Once suitable resources have been discovered, the next step involves establishing Service Level Agreements (SLAs) to provide some degree of assurance that the requested QoS requirements are likely to be met by the resources. Once the user has submitted his jobs to the computing resource, the broker will monitor the resource to check whether the QoS requirements of the user are being met. The broker must decide when corrective action should be taken if particular QoS constraints are not being met. For instance, in our example, the user requirement is a one hour deadline; hence, the broker may consider that after 30 minutes half of the jobs should be finished (assuming that all the jobs are identical). If this is not the case, and fewer jobs have been finished, then the broker may decide to allocate more resources to run those jobs. This way, the broker modifies the initial set of resources allocated to run this user's jobs so that the deadline can be met.
A key requirement in this scenario is for the broker to develop a predictive model identifying the likely completion times of the jobs that have been submitted by the user, and whether the QoS constraints are likely to be violated. Consider the scenario where the deadline for a job cannot be met. In this case, the broker can attempt to migrate the job to another resource, but depending on the type of job (for instance, if the job cannot be checkpointed) the time spent so far is lost: if after 30 minutes we discover that the job will not meet the deadline and decide to move it, we have only 30 minutes left to execute the job on the new resource within the 1 hour deadline. Therefore, when choosing resources, it is necessary to take into account the characteristics of each resource and each job during the resource allocation process. Hence, we propose the following alternative resource selection strategy. When the user submits his resource request to the broker, including his QoS requirements, the broker performs a resource selection taking into account the features of the resources available at that moment, along with the features of the user's jobs and his QoS requirements. Once the jobs have been allocated to a resource, the broker performs the monitoring. If the broker infers that the requirement will not be met (e.g., after 30 minutes it discovers that the 1 hour deadline will not be met), it proceeds with a new resource allocation. Again, this allocation must take into account the features of resources, jobs and QoS requirements. This scenario is focused on high throughput computing, where there are no dependencies among jobs.
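As an illustration, the broker's progress check described above could be sketched as follows. The function name and the linear-progress assumption are ours, not part of any Grid middleware; the sketch simply assumes m identical jobs whose completions should grow linearly with elapsed time.

```python
# Hypothetical sketch of the broker's progress check. Assuming identical
# jobs and a single deadline, completed jobs are expected to grow linearly
# with elapsed time (e.g. half the jobs done at half the deadline).

def should_reallocate(jobs_total, jobs_finished, elapsed, deadline):
    """Return True if fewer jobs have finished than the linear-progress
    model expects, i.e. the deadline is likely to be missed."""
    expected = jobs_total * (elapsed / deadline)
    return jobs_finished < expected

# After 30 minutes of a 1-hour deadline, only 40 of 100 jobs are done,
# so the broker would allocate more resources:
print(should_reallocate(100, 40, elapsed=30, deadline=60))  # True
```

A real broker would combine such a check with per-job latency predictions (Sections 4.3 and 4.4) rather than assuming identical jobs.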
4. NETWORK-AWARE SCHEDULING

Our initial approach, whose structure was presented in [5], provides an efficient selection of resources in a single administrative domain, taking into account the features of the jobs and resources, including the status of the network. But the latter is not considered in a realistic way, as it is assumed that only Grid transmissions flow through the network. In this paper we improve on this by checking the status of the network in a more realistic way. Also, we extend our previous proposal by incorporating autonomic computing: our model uses feedback from resources and network elements in order to improve system performance. Our scheduler will therefore adapt its behavior according to the status of the system, paying special attention to the status of the network. Our scenario is depicted in Figure 2 and has the following entities: users, each one with a number of jobs; computing resources, e.g. clusters of computers; routers, whose network buffers are managed by a policy, e.g. the Random Early Detection (RED) [9] algorithm; the GNB (Grid Network Broker), an autonomic network-aware scheduler; a GIS (Grid Information Service), such as [8], which keeps a list of available resources; a resource monitor (for example, Ganglia [13]), which provides detailed information on the status of the resources; and a BB (Bandwidth Broker), such as [16], which is in charge of the administrative domain and has direct access to the routers. The BB can be used to support reservation of network links, and can keep track of the interconnection topology between two end points within a network. The interaction between components within the architecture is as follows:

• Users ask the GNB for a resource to run their jobs. Users provide the features of their jobs and a deadline.
• The GNB collects the requests and, after a given time interval, performs two operations for each request: first, it schedules the job to a computing resource; second, it asks the BB to perform connection admission control (CAC). The GNB uses link latencies provided by the BB to carry out the scheduling. The BB uses link latencies and feedback from resources and routers to complete the CAC. Link latencies are collected by means of ping packets which each router sends to all the entities directly connected to it. These pings are small (made of a few packets), and a bound on the latency of a real-sized transmission is then inferred (this is explained in Section 4.3). This way, we get real information on the current status of the network. Once the GNB has decided the computing resource where a job will be executed, it carries on with the next scheduling request. Feedback from resources consists of the number of dropped packets, the job arrival time and the number of jobs waiting in the queue. Using this information, it is possible to infer whether the deadline is likely to be met. Router information includes details such as the number of packets in router link buffers.

• The GNB makes use of the GIS in order to get the list of available resources, and then gets their current load from the resource monitor.

• The GNB makes use of the BB in order to carry out operations involving the network. The BB should implement functions to calculate network path latencies, and to check whether a transmission can be routed through a specific link. By giving the BB these functionalities, we ensure the independence and autonomy of each administrative domain.

We now explain the scheduling and connection admission control algorithms, which are performed by the GNB and the BB, respectively. After that, we explain the calculation of link and CPU latencies, and the effective bandwidth of links.
4.1 Scheduler

The GNB performs scheduling in rounds, which take place with a given frequency. Algorithm 1 shows how the GNB performs the scheduling. It calculates a prediction of the latency of a job on each computing resource (line 4), considering the network latency between the user and the resource, and the CPU latency of the resource. The former is calculated by the BB, as mentioned before, and is detailed in Section 4.3, whilst the CPU latency is explained in Section 4.4. If the latency offered by a resource is greater than the job deadline, the GNB discards that resource (line 6). The remaining resources are then ordered by the predicted latency of the job, from the smallest to the largest (line 10). The result of the scheduling is an ordered list of computing resources, from the best to the worst.
4.2 Connection admission control (CAC)

Once the scheduler has ordered the available resources, the connection admission control (CAC) algorithm is activated.
Algorithm 1 Scheduler algorithm.
1: Let u = a user, the job's owner
2: Let R = set of computing resources
3: for all ri in R do
4:   latency_job(u, ri) = latency_disk(ri) + latency_cpu(ri) + latency_network(u, ri)
5:   if latency_job(u, ri) > deadline then
6:     discard(ri)
7:   end if
8:   increment(i)
9: end for
10: Rmin = ordered set {min(latency_job(u, ri))}, ∀i
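A minimal Python sketch of Algorithm 1 follows. The latency functions are placeholders for the network, CPU and disk models of Sections 4.3 and 4.4, and all names are illustrative, not part of the GNB's actual interface:

```python
# Sketch of Algorithm 1. disk_lat, cpu_lat and net_lat stand in for the
# latency models described in Sections 4.3 and 4.4 (assumptions here).

def schedule(user, resources, deadline, disk_lat, cpu_lat, net_lat):
    """Return the resources able to meet the deadline, ordered from the
    smallest predicted job latency to the largest (the set Rmin)."""
    latencies = {}
    for r in resources:
        total = disk_lat(r) + cpu_lat(r) + net_lat(user, r)  # line 4
        if total <= deadline:        # lines 5-7: discard late resources
            latencies[r] = total
    return sorted(latencies, key=latencies.get)              # line 10

# Toy example with fixed latencies (in minutes):
cpu = {"r1": 10, "r2": 80, "r3": 5}
ordered = schedule("u1", ["r1", "r2", "r3"], deadline=60,
                   disk_lat=lambda r: 1,
                   cpu_lat=lambda r: cpu[r],
                   net_lat=lambda u, r: 4)
print(ordered)  # ['r3', 'r1'] -- r2 exceeds the deadline and is discarded
```

The ordered list is then handed to the CAC algorithm, which admits the first resource whose network path can carry the transmission.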
Figure 3: CAC.
The CAC is carried out by the BB to keep the independence and autonomy of administrative domains, as direct access to the routers is required to perform it. The overall algorithm is shown in Algorithm 2. We consider the computing resources from best to worst, as ordered by the scheduler. It is necessary to estimate the effective bandwidth between two end points in the network – these being the user, and the computing resource under consideration – and this is carried out link by link. Thus, for each resource, each of the links between the user and the resource is checked (line 2). Each link is analyzed to find out whether it has enough effective bandwidth (line 3), as explained in Algorithm 3. If a link does not fulfill the effective bandwidth test, the resource is discarded and the algorithm proceeds with the next computing resource (line 4). If the link fulfills the test, the next link in the path is checked (line 6). If all the links in the path fulfill the effective bandwidth test, the current resource is chosen to execute the job (line 10). When the BB chooses a resource, the GNB decreases the TOLERANCE for the pair (user location, resource) (line 14). This term is explained next. The term TOLERANCE reflects how strictly we want the average buffer size of links to stay under the lower Random Early Detection (RED) [9] threshold. Note that with the RED network buffer management algorithm, when the average buffer size of a link is between the two thresholds, the next incoming packet has a probability of being dropped. Thus, if we want to keep the average buffer size strictly under the lower threshold, the TOLERANCE should have a low value. There is a TOLERANCE for each pair (user location, resource), which means that all the users from the same location share the same TOLERANCE for each resource. Also, the TOLERANCE changes in response to feedback information from resources.
This means that if a resource detects a lot of dropped packets coming from a given location, the TOLERANCE for these users will be reduced, so that at the next scheduling round they will have a lower probability of using this resource. Also, users' feedback affects the TOLERANCE, as users tell the GNB whether a job has been completed within its deadline. Thus, in subsequent scheduling rounds the broker will take this into account in order to accept transmissions to this resource. This way, the architecture applies autonomic computing features and adapts its behavior to the status of the system, based on feedback from users, resources and routers.

Algorithm 2 CAC algorithm.
1: repeat
2:   for all links lj in networkPath(u, ri), (j = 1, i = 1, ∀ri ∈ Rmin) do
3:     if linkEffectiveBandwidthTest(lj) != OK then
4:       discard(ri), increment(i), break
5:     else
6:       increment(j)
7:     end if
8:   end for
9:   if (! discard(ri)) then
10:    choose(ri)
11:  end if
12: until (choose(ri)) or (i == sizeOf(Rmin))
13: if (choose(ri)) then
14:   reduceTolerance(location(u), ri)
15: end if

Algorithm 3 Link's effective bandwidth test.
1: Let t = the current transmission
2: Let l = the link being checked
3: Let T = the set of transmissions whose next link is l, including t
4: Let S = Σ previous_link_effective_bw(ti), ∀i in T
5: if S > effectiveBandwidth(l) then
6:   if (avg_buffer_occupation(l) + increment_buffer_occupation(t, l) < RED_low_threshold(l) × TOLERANCE) then
7:     return OK
8:   else
9:     return NOT OK
10:  end if
11: else
12:   return OK
13: end if
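Algorithms 2 and 3 could be sketched in Python as follows. The link dictionaries and the way per-link state is represented are our assumptions for illustration, not part of the BB's actual interface:

```python
# Sketch of Algorithms 2 and 3. Each link is a dict holding its effective
# bandwidth, the bandwidth of already-accepted transmissions whose next
# hop is that link, and its RED buffer state (all fields illustrative).

def link_test(link, new_bw, tolerance):
    """Algorithm 3: True (the OK result) if a new transmission of
    bandwidth new_bw can use `link`."""
    if link["accepted_bw"] + new_bw <= link["effective_bw"]:
        return True                      # line 12: enough bandwidth left
    # Lines 5-10: otherwise check the RED buffer against the TOLERANCE
    projected = link["avg_buffer"] + link["buffer_increment"]
    return projected < link["red_low_threshold"] * tolerance

def cac(resources_ordered, path_of, new_bw, tolerance):
    """Algorithm 2: return the first resource (from the scheduler's
    ordered list) whose whole path admits the job, or None."""
    for r in resources_ordered:
        if all(link_test(l, new_bw, tolerance) for l in path_of(r)):
            return r   # line 10: chosen; the GNB then reduces TOLERANCE
    return None

# Figure 3's example: 1617 Mbps already accepted on a 2.5 Gbps link, so a
# new 100 Mbps connection passes the first check:
link = {"accepted_bw": 1617, "effective_bw": 2500,
        "avg_buffer": 40, "buffer_increment": 5, "red_low_threshold": 100}
print(cac(["r1"], lambda r: [link], new_bw=100, tolerance=0.5))  # r1
```

Lowering `tolerance` makes the fallback buffer check stricter, which is how the feedback described above reduces a location's chances on a congested resource.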
Algorithm 3 explains how each link is tested in order to find out whether a new transmission can go through it. We have to consider the connections accepted in the current scheduling round which go through the current link, plus the one being checked at this moment. For the former, we consider the effective bandwidth of the previous link each of them traverses, and add these up (line 4). If this sum plus the effective bandwidth of the connection being checked is less than the effective bandwidth of the current link, then the connection is accepted through this link (line 12). For the connections shown in Figure 3 (where T includes all the connections to the left of the router), the bandwidths add up to 1617 Mbps, whilst the current link provides 2.5 Gbps, thus the new connection is accepted at this link. Otherwise, we have to consider the available network buffer in the current link, which is based on the average buffer occupation of the link plus the increment due to the current connection (line 6). If the current link has enough free buffer, then the link passes the effective bandwidth test (line 7).

4.3 Calculating performance of links

The best way to calculate the latency of a link would be to send pings with real sizes (the size of the I/O files of the jobs), but this approach is not acceptable, because such ping messages would congest the network. An alternative approach is for routers to send a few packets (much smaller than the real size of the I/O files) to all the entities directly connected to them. This can be used to estimate the real values as follows:

N_pkts_RTT ≈ 1_pkt_RTT + (N − 1) × T_tx    (1)

where N is the size in packets of one of our I/O files, 1_pkt_RTT is the round trip time of a single ping packet, and T_tx is the time to transmit one packet. By doing this we can get a lower limit for the latency of a link. But this is only the current latency of the link; to determine the future use of a link, we must use a predictive model. A predictive model can be based on how the Transmission Control Protocol (TCP) computes the value for its retransmission time-outs. For example, we can consider diff = RTS − old_RTT, and RTT = old_RTT + δ × diff, where RTS is the last sample of the RTT, calculated following Equation 1, and δ reflects the importance of the last sample in the calculation of the next RTT [17]. Also, we can calculate the effective bandwidth for a link given its latency, by means of the following formula:

link_effective_bw = pkt_size / 1_pkt_RTT    (2)

In the same way as for the RTT, a prediction for the effective bandwidth can be calculated by considering the last sample and the current prediction.
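The latency and bandwidth estimators of this section could be sketched as follows. The gain δ = 0.125 (TCP's usual smoothing constant) is our choice, and the sample values are invented for illustration:

```python
# Sketch of Equations (1)-(2) and the TCP-style predictor. delta = 0.125
# (TCP's usual smoothing gain) and all sample values are assumptions.

def rtt_sample(one_pkt_rtt, n_packets, t_tx):
    """Equation (1): lower bound on the RTT of an N-packet transfer,
    inferred from a single small ping."""
    return one_pkt_rtt + (n_packets - 1) * t_tx

def predict(old_rtt, sample, delta=0.125):
    """RTT = old_RTT + delta * (RTS - old_RTT), as in TCP [17]."""
    return old_rtt + delta * (sample - old_rtt)

def link_effective_bw(pkt_size, one_pkt_rtt):
    """Equation (2): effective bandwidth from the single-packet RTT."""
    return pkt_size / one_pkt_rtt

# A 1000-packet file, 2 ms single-packet RTT, 0.1 ms per-packet T_tx:
rts = rtt_sample(2.0, 1000, 0.1)   # 2.0 + 999 * 0.1 = 101.9 ms
rtt = predict(100.0, rts)          # old prediction smoothed toward rts
print(round(rts, 1), round(rtt, 4))  # 101.9 100.2375
```

The same `predict` update can be reused for the effective-bandwidth and CPU-load predictions mentioned below.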
4.4 Calculating CPU latency

The CPU latency of a job in a computing resource is calculated by means of the following formula:

CPU_lat ∝ ((jobs_already_assigned + 1) / cpu_speed) × current_load    (3)

meaning that the CPU latency of a job in a computing resource is proportional to the number of jobs already assigned to the resource, and depends on the CPU speed and the current load of the resource. jobs_already_assigned + 1 is the number of jobs already assigned to that resource by the architecture in the current scheduling round (assuming all the jobs are identical) plus the current job being checked, and cpu_speed and current_load are metrics obtained from a resource monitor. jobs_already_assigned refers to jobs that have not yet been submitted to the computing resource, as the broker has not finished the scheduling round. current_load is an average load, for example the load_fifteen, load_five or load_one metrics measured by Ganglia, and thus it includes both local load and Grid load. It can be turned into a fraction in the range [0, 1], where 0 means totally idle and 1 totally busy. Also, a prediction of the next current_load can be calculated similarly to the link latency and effective bandwidth
explained before. This formula is subject to changes during the implementation process, as we still have to check its correctness.
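As a sketch, and with an arbitrary proportionality constant k (Equation (3) only states proportionality), the CPU latency prediction could look like:

```python
# Sketch of Equation (3). k is an arbitrary proportionality constant;
# cpu_speed and current_load would come from a resource monitor such as
# Ganglia, with current_load normalised to [0, 1].

def cpu_latency(jobs_already_assigned, cpu_speed, current_load, k=1.0):
    """CPU_lat proportional to
    ((jobs_already_assigned + 1) / cpu_speed) * current_load."""
    return k * (jobs_already_assigned + 1) / cpu_speed * current_load

# A fast, lightly loaded node beats a slow, busy one:
print(cpu_latency(3, cpu_speed=2.0, current_load=0.1))  # 0.2
print(cpu_latency(3, cpu_speed=1.0, current_load=0.9))  # 3.6
```

Note that, as stated in the text, this formula is tentative; the sketch only illustrates how the three monitored quantities combine.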
5. CONCLUSIONS AND FUTURE WORK
Due to the inherent complexity of Grid systems, achieving large-scale distributed computing in Grids turns out to be a complex task. One way to tackle this problem is by means of autonomic systems, which can change their behavior in response to changes in the status of the system. We have presented an architecture which merges Grid scheduling with autonomic computing techniques in order to provide users with a seamless and self-healing system view. Our proposal improves on other Grid network QoS proposals as it considers the network when performing the mapping between jobs and computing resources. Also, the architecture presented here extends current autonomic job scheduling policies for Grid systems, as it considers the status of the network when reacting to changes in the system. To do so, we consider the load of the computing resources and the load of the network links, by means of predictions of computing times and network latencies. Thus, the architecture provides scheduling of jobs to computing resources and connection admission control, so that the network does not become overloaded. This is carried out by two entities, the GNB and the BB. We are currently implementing the proposal using a simulation tool, GridSim [18], because a lot of work is required to set up testbeds on many distributed sites. Even if automated tools existed to do this work, it would still be very difficult to produce performance evaluations in a repeatable and controlled manner, due to the inherent heterogeneity of the Grid. In addition, Grid testbeds are limited, and creating an adequately-sized testbed is expensive and time consuming. Therefore, it is easier to use simulation as a means of studying such complex scenarios.
6. REFERENCES
[1] J. H. Abawajy. Autonomic job scheduling policy for grid computing. In 5th Intl. Conf. on Computational Science (ICCS), LNCS. Springer, 2005.
[2] D. Adami et al. Design and implementation of a grid network-aware resource broker. In Proc. of the IASTED Intl. Conf. on Parallel and Distributed Computing and Networks, 2006.
[3] R. Al-Ali et al. Network QoS provision for distributed grid applications. Intl. Journal of Simulation Systems, Science and Technology, 2004.
[4] S. Bhatti, S. Sørensen, P. Clark, and J. Crowcroft. Network QoS for grid systems. Intl. Journal of High Performance Computing Applications, 17(3), 2003.
[5] A. Caminero, C. Carrión, and B. Caminero. Designing an entity to provide network QoS in a grid system. In Proc. of the 1st Iberian Grid Infrastructure Conference (IberGrid), 2007.
[6] H.-H. Chu and K. Nahrstedt. CPU service classes for multimedia applications. In Proc. of the IEEE Intl. Conf. on Multimedia Computing and Systems (ICMCS), 1999.
[7] X. Dong, S. Hariri, L. Xue, H. Chen, M. Zhang, S. Pavuluri, and S. Rao. Autonomia: an autonomic computing environment. In Proc. of the IEEE Intl. Conf. on Performance, Computing and Communications, 2003.
[8] S. Fitzgerald, I. Foster, C. Kesselman, G. von Laszewski, W. Smith, and S. Tuecke. A directory service for configuring high-performance distributed computations. In Proc. of the 6th IEEE Symp. on High Performance Distributed Computing, 1997.
[9] S. Floyd and V. Jacobson. Random early detection gateways for congestion avoidance. IEEE/ACM Trans. on Networking, 1(4):397–413, 1993.
[10] I. Foster and C. Kesselman. The Grid 2: Blueprint for a New Computing Infrastructure. Morgan Kaufmann, 2nd edition, 2003.
[11] H. Liu and M. Parashar. A component based programming framework for autonomic applications. In Proc. of the Intl. Conf. on Autonomic Computing, 2004.
[12] F. T. Marchese and N. Brajkovska. Fostering asynchronous collaborative visualization. In Proc. of the 11th IEEE Intl. Conf. on Information Visualization, 2007.
[13] M. L. Massie, B. N. Chun, and D. E. Culler. The Ganglia distributed monitoring system: design, implementation, and experience. Parallel Computing, 30(5-6):817–840, 2004.
[14] G. Mateescu. Extending the Portable Batch System with preemptive job scheduling. In SC2000: High Performance Networking and Computing, 2000.
[15] A. Roy. End-to-End Quality of Service for High-End Applications. PhD thesis, Dept. of Computer Science, University of Chicago, 2001.
[16] S. Sohail, K. B. Pham, R. Nguyen, and S. Jha. Bandwidth broker implementation: circa-complete and integrable. Technical report, School of Computer Science and Engineering, The University of New South Wales, 2003.
[17] W. R. Stevens. TCP/IP Illustrated: The Protocols. Addison-Wesley, 1994.
[18] A. Sulistio, G. Poduval, R. Buyya, and C.-K. Tham. On incorporating differentiated levels of network service into GridSim. Future Generation Computer Systems, 23(4):606–615, May 2007.
[19] VIOLA: Vertically Integrated Optical Testbed for Large Application in DFN, 2006. Web site at http://www.viola-testbed.de/.
[20] O. Waldrich, P. Wieder, and W. Ziegler. A meta-scheduling service for co-allocating arbitrary types of resources. In Proc. of the 6th Intl. Conf. on Parallel Processing and Applied Mathematics (PPAM), LNCS. Springer, 2005.