Dynamic Provisioning of Resources in Data Centers

Dynamic Provisioning of Resources in Data Centers
Bradley Simmons, Angela McCloskey, Hanan Lutfiyya
Department of Computer Science, The University of Western Ontario, London, Ontario, Canada
{bsimmons, anmcclos, hanan}@csd.uwo.ca

over-provisions resources. However, the allocation of resources in this manner leads to underutilization, since some allocated resources will not be needed most of the time. An alternative solution is to allocate resources dynamically as needed and to deallocate resources from an application environment when the demand for the applications in that environment is relatively low. The deallocated resources can then be used for other application environments. It is possible that at any point in time the resource demands from all application environments exceed the resource supply. In this case the management system needs to make decisions about resource allocation. The basis for these decisions depends on policies, where a policy is defined as any type of formal behavioral guide [1]. A change in policy should not require a recompilation of the system. Kephart et al. [7] defined several classes of policies, including utility function policies. A utility function policy defines a function that maps a possible state to a value. A goal policy specifies the properties of a desired state; for example, a goal policy may specify a desired response time for a specific application environment. An action policy is the more traditional form of policy, which is essentially a rule, i.e., if condition then action. The relationship between a utility function policy and a goal policy is that the goal policy specifies the desired value of a state, where the value is computed using a utility function. Most of the existing work on utility function policies focuses on evaluating the effectiveness of the different functions specified in the utility function policy.
An optimization algorithm is used to determine the maximum (or minimum) value of a utility function. An example is a utility function that, given a specific resource allocation, expected demand estimates and a desired response time, calculates a

Abstract: Data centers make decisions about the allocation of resources to application providers based on policies. Policies are any type of formal behavioral guide and are input to the component that makes resource allocation decisions. This paper describes a system that uses policies from different sources, e.g., SLAs (negotiated between the application provider and the data center) and business objectives and rules (from the data center). Decisions are based on a model that integrates all the policies together. Policy changes can be accommodated without recompiling the system.
Key Words: Data Centers, SLAs, Policies

1. INTRODUCTION

A data center is a collection of computing resources shared by multiple third-party applications concurrently in return for revenue provided by the providers of those applications. The subset of resources assigned to an application is an application environment. Application environments are logically separated from each other. In this paper the primary resource being considered is a server. A quality of service (QoS) requirement is defined as a non-functional requirement that can be expressed using metrics such as response time, availability, throughput and utilization. The computing resources that an application needs vary over time. For example, if an application is required to have a response time of less than t seconds, and the expected workload increases or the required response time decreases (as the result of a change to a QoS requirement), then the required computing resources increase accordingly. One approach for ensuring the satisfaction of the computing needs of a particular application is to provide enough resources for the anticipated peak demand, i.e., a static approach that


being able to satisfy all goal policies including those associated with an application environment. The GPM is responsible for allocating and deallocating data center resources to application environments. The GPM handles considerations related to resource contention between application environments and limiting the adverse consequences to the data center provider of missing SLA guarantees. Periodically each AEM sends its requests for additional resources to the GPM. This is referred to as the GPM's periodic review. The number of additional resources requested is based on models implemented by the AEM to determine the number of servers needed. Based on these calculations the GPM makes decisions about new resource allocations. If a server in an application environment becomes unavailable then the AEM for that application environment makes an immediate request for a replacement server. If an AEM does not get the resources requested then the AEM executes actions to deal with this situation. These actions are specified through action policies. Both the GPM and AEM use a rule engine. The rule engine evaluates if condition then action rules in response to events (i.e., asynchronous requests for resources). The GPM uses an optimizer for periodic reviews. The primary focus of this paper is on the GPM. Data centers can be thought of in two ways: as utility providers of raw computational resources or as providers of computational components. The first model considers the data center as a utility provider (analogous to a hydro or gas utility) and utilizes guarantees on metrics (i.e., bandwidth, utilization, CPU cycles, etc.). The second model, and the model assumed in this paper, sees the data center provider as the arbiter of actual resources (e.g., servers, light-paths, etc.). This approach is preferred due to its simplifying nature at the data center level (i.e., pushing complexity out to the application environment providers).
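The interaction just described, in which each AEM periodically reports its resource needs to the GPM, can be sketched as follows. This is an illustrative sketch only: the class and field names (AEM, GPM, servers_needed) are invented here, and the actual prototype is written in Java with a rule engine and optimizer behind the review.

```python
from dataclasses import dataclass

@dataclass
class AEM:
    """Application Environment Manager: reports additional servers needed."""
    name: str
    servers_needed: dict  # machine type -> number of additional servers requested

class GPM:
    """Global Perspective Manager: collects AEM requests each review period."""
    def __init__(self, aems):
        self.aems = aems

    def periodic_review(self):
        # Collect each AEM's request; a real GPM would feed these into its
        # rule engine and optimizer rather than returning them directly.
        return {aem.name: aem.servers_needed for aem in self.aems}

gpm = GPM([AEM("gold-shop", {"web": 2}), AEM("bronze-blog", {"web": 1})])
print(gpm.periodic_review())
```

Asynchronous events (such as a server failure) would bypass this loop and go straight to the rule engine, as described above.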

monetary value. The goal of the optimization algorithm is to determine a resource allocation that provides the highest monetary value. There are several issues not yet considered by the existing work. First, in a data center there will be multiple goal policies, with a goal policy associated with each application environment and with the data center as a whole. Much of the current work does not address the issue of not being able to satisfy all goal policies. An example of this occurring is when there are not enough resources to support the demand from all of the application environments. Second, the satisfaction of a goal policy may be constrained by business criteria. For example, a goal policy for a data center may be to maximize profit. However, it may be considered a poor decision to satisfy goal policies of application environments classified as gold at the expense of taking away necessary resources already assigned to other application environments. A data center management system must be able to handle these issues and adapt to changes in policies without recompiling the code. This is important to making the data center adaptive.

2. HIGH-LEVEL MANAGEMENT ARCHITECTURE

This section briefly describes a two-tier management framework for data centers first proposed in [13] composed of the following management entities: the Application Environment Manager (AEM) and the Global Perspective Manager (GPM). The AEM is responsible for managing resources within the context of a single application environment. This may include provisioning resources allocated to that application environment across various service classes. Different application environments will have different requirements for adapting to a peak in workload that requires additional resources that cannot be provided by the data center. Example adaptations include the following: (i) If a request for additional resources is not satisfied then do not process requests from a specific service class; (ii) If a request for additional resources is not satisfied then reduce the number of requests to be processed concurrently by the same amount for all service classes. Thus all service classes will have an equal reduction in the number of requests processed. These adaptations are examples of action policies. This is necessary to partially address the problem of not
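The two example adaptations above, (i) and (ii), can be sketched as condition-to-action mappings. The function name, service classes and reduction amount below are invented for illustration; the prototype expresses such action policies in a rule language rather than in code like this.

```python
def apply_action_policy(policy, limits, granted, drop_class="bronze", reduction=5):
    """limits: service class -> max number of requests processed concurrently."""
    if granted:
        return dict(limits)  # request satisfied: no adaptation needed
    if policy == "drop_class":      # adaptation (i): stop serving one class
        return {c: (0 if c == drop_class else n) for c, n in limits.items()}
    if policy == "uniform_reduce":  # adaptation (ii): equal cut for every class
        return {c: max(0, n - reduction) for c, n in limits.items()}
    raise ValueError(f"unknown policy: {policy}")

limits = {"gold": 20, "silver": 15, "bronze": 10}
print(apply_action_policy("drop_class", limits, granted=False))
print(apply_action_policy("uniform_reduce", limits, granted=False))
```

Which adaptation applies is itself a per-environment policy choice, which is why these are specified as action policies rather than hard-coded.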

3. CONSIDERATIONS IN RESOURCE ALLOCATION BY THE GPM

This section describes considerations in resource allocation by the GPM. Broadly speaking these are all a form of policy as defined in Section 1, where a policy is any type of formal behavioral guide.


3.1 SERVICE LEVEL AGREEMENTS

A service level agreement (SLA) is defined as an agreement between two parties about the service to be provided by one party to the other. For a data center, the agreement includes the specification of expected behavior of the service to be provided (similar to a goal policy), penalties to be paid by the service provider if the service is not provided as expected (an example of an action policy), and the cost to the receiver of the service if the service is provided as expected. An SLA between the data center provider and an application provider assumes that the service being provided is resource provisioning. Examples of what can be included in an SLA include the following:
1. The minimal number of server machines of a specific type (identified by k) required by an application environment, AE_i. This is denoted AE_i.min_k. Different types of machines are needed for multi-tier applications; the requirements of a machine for a database server may differ from those of a machine for a web server.
2. For a machine of type k there is an expected average availability. This is denoted AE_i.avail_k. It is assumed that availability is measured as the percentage of time that the server is available.
3. There are two types of costs. One cost is associated with a machine that is one of the minimal number of machines of a specific type k; this is denoted AE_i.price_k. The other is the price to be paid for additional servers beyond the minimal number; this is denoted AE_i.additional_price_k.
4. A penalty is paid by the data center provider to the application provider if one of the servers becomes unexpectedly unavailable for a specific amount of time, i.e., if a server of type k is unavailable for more than x minutes then the data center provider pays AE_i.penalty_k.
5. Let R_k denote the resource pool of machines of type k.
If the most recently measured CPU utilization of machine j of type k in AE_i, denoted cpu-utilization_ijk, is below some threshold cpu-utilization_th, then machine j of type k is returned from application environment i to the resource pool R_k, i.e., if cpu-utilization_ijk < cpu-utilization_th then return machine j of type k to R_k. This may be part of an SLA since different application environments may have different threshold values. It is evaluated by an AEM in response to a periodic review.
6. For application environment AE_i, if a server of type k is allocated then a server of type l must also be allocated. This is an example of a condition using multiple attributes.
Examples 5 and 6 represent SLA rules.
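The per-machine-type SLA attributes enumerated above (items 1 to 5) can be gathered into one record. The class and field names below are a hypothetical rendering of the paper's notation (min_k, avail_k, price_k, and so on), not the prototype's actual representation.

```python
from dataclasses import dataclass

@dataclass
class SLATerms:
    """SLA attributes for one machine type k in application environment AE_i."""
    machine_type: str          # k
    min_servers: int           # AE_i.min_k
    availability: float        # AE_i.avail_k, fraction of time available
    price: float               # AE_i.price_k, for the first min_k machines
    additional_price: float    # AE_i.additional_price_k, beyond min_k
    penalty: float             # AE_i.penalty_k, paid per qualifying outage
    cpu_util_threshold: float  # below this, return the machine to pool R_k

# Invented example values for a database-tier machine type.
db_terms = SLATerms("db", min_servers=2, availability=0.999,
                    price=100.0, additional_price=60.0, penalty=25.0,
                    cpu_util_threshold=0.15)
print(db_terms.min_servers, db_terms.additional_price)
```

An SLA for a multi-tier application would hold one such record per machine type, plus cross-type rules like example 6.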

3.2 BUSINESS OBJECTIVES

The allocation of resources in a data center should be based on business objectives. Simply stated, examples of business objectives include minimizing cost, maximizing profit or maximizing customer satisfaction. Attributes characterizing a business objective are referred to as business indicators. Business indicators are measurable. A business objective is more formally defined as a constraint on a business indicator, or a target that the value of a business indicator should try to satisfy. There may be multiple business objectives, and a weight is assumed to be assigned to each of them. A function of the business indicators is used to evaluate a state, or possible state, defined by the resource allocation. These functions are used to determine the effectiveness of a specific resource allocation and are similar to utility function policies. A business objective represents the desired value of a utility function, e.g., maximize profit; this is similar to a goal policy. This concept of business objective appears to be similar to that defined in [1].
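A weighted function over business indicators, as described above, can be sketched as follows. The indicator names and weight values are invented for illustration; the paper leaves the concrete function open.

```python
def utility(indicators, weights):
    """Score a (possible) resource-allocation state by its business indicators.

    indicators: business indicator name -> measured value for this state
    weights:    business indicator name -> weight reflecting its importance
    """
    return sum(weights[name] * value for name, value in indicators.items())

# Hypothetical state produced by one candidate resource allocation.
state = {"profit": 900.0, "customer_satisfaction": 0.75}
weights = {"profit": 1.0, "customer_satisfaction": 200.0}
print(utility(state, weights))  # 900 + 150 = 1050.0; higher is better
```

The optimizer's job is then to pick the allocation whose state maximizes this score, subject to the business rules of the next section.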

3.3 BUSINESS RULES FOR RESOURCE ALLOCATION

A data center may impose constraints on achieving business objectives. For example, a request for additional resources cannot be satisfied if the number of spares is less than a specified threshold. The reason is that the spares should be used if a server becomes unexpectedly unavailable between periodic reviews. These constraints are referred to as business rules. Two events can cause the GPM to consider changes to the resources allocated to an application environment. The first event corresponds to the


response to the request from the AEM to the GPM asking for needed resources in the next time period. The second event corresponds to an asynchronous request by an AEM for a replacement of a dedicated server that has gone down. The following are examples of business rules that may be considered by the GPM in determining resource allocation. Let AE_i.number-assigned_j denote the number of machines of type j assigned to application environment AE_i. Let R_j denote the resource pool for machines of type j. The set R = {R_1, R_2, ..., R_N} is the set of all spare resources.
1. If a message (between periodic reviews) is received from an AEM indicating that one of its dedicated machines is down, then allocate a machine from the appropriate spare resource pool to the application environment managed by the reporting AEM, provided the size of the pool is at least one.
2. For a periodic review, the number of allocated type-k servers for bronze application providers should be less than 20% of all available servers of type k in R_k.
3. The resource allocation done as part of a periodic review is constrained in that the size of any resource pool R_j should be at least T', except from 15:00 to 17:00 when it should be at least T. This is more formally defined as follows:
a. If 15:00 < current_time < 17:00 then sizeof(R_j) ≥ T.
b. If current_time < 15:00 OR current_time > 17:00 then sizeof(R_j) ≥ T'.
Essentially this represents a condition that must hold true in the resource allocation made for a periodic review.
4. During a periodic review, priority in resource allocation is given to those application environments that do not have the minimum resources specified in their SLAs. A rule representing this is the following: if AE_i.number-assigned_k < AE_i.min_k then AE_ik.priority = "highest priority", where AE_i.number-assigned_k is the number of machines of type k assigned to AE_i and AE_ik.priority is the priority assigned to AE_i for getting a machine of type k.
5. A machine may only be assigned to one application environment.
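Business rules 1, 3 and 4 above amount to simple predicates over the current state. The helper names and the concrete values of T and T' below are invented; only the logic follows the rules as stated.

```python
def can_replace_failed_server(pool_size):
    # Rule 1: hand out a spare only if the pool has at least one machine.
    return pool_size >= 1

def pool_floor(current_hour, T=4, T_prime=2):
    # Rule 3: minimum spare-pool size; T applies from 15:00 to 17:00,
    # T' (T_prime) at all other times.
    return T if 15 <= current_hour < 17 else T_prime

def has_highest_priority(assigned, minimum):
    # Rule 4: environments below their SLA minimum get highest priority.
    return assigned < minimum

print(can_replace_failed_server(0))                 # no spares left
print(pool_floor(16), pool_floor(9))                # in-window vs out-of-window
print(has_highest_priority(assigned=1, minimum=2))  # below SLA minimum
```

Rule 2 and rule 5 are constraints over a whole allocation rather than per-event predicates; their mappings are shown in Section 4.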

Examples 1, 2, 4 and 5 are event-triggered if condition then action rules. The third example is a condition that must hold of the resource allocation made after a periodic review.

4. MAPPINGS

This section describes the approach used to map the information described in Section 3 to a form that can be used by the GPM. Examples are used to illustrate the mappings. The examples assume N resource pools and M application environments. In the first example described in Section 3.3, the event is the notification that a machine assigned to an AEM is no longer available. The condition is that the size of the resource pool associated with the needed server machine is at least one. The action is that a spare is assigned to the application environment. This business rule can be mapped to a rule specified in the language used by the rule engine in the GPM. The SLA specifies penalties to be paid if an assigned resource has been unavailable; this also maps to a rule of the rule engine in the GPM. Let x_ijk = 1 denote that server j of type k is assigned to AE_i. A machine returned to a resource pool can be considered for allocation to other application environments. This implies that a variable x_ijk is created for each AE_i in the optimization model. For the second example defined in Section 3.3, let B ⊆ AE be the set of all bronze application providers. The following constraint must be satisfied:

Σ_{i∈B} Σ_{j=0..sizeof(R_k)−1} x_ijk ≤ 0.20 · sizeof(R_k)
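The bronze constraint just stated can be checked against a candidate allocation directly. Here x[i][j] == 1 means server j of type k is assigned to application environment i, and B is the set of bronze environments; all names and data values are invented for illustration.

```python
def bronze_share_ok(x, bronze, pool_size, cap=0.20):
    """Check: sum over bronze environments of assigned type-k servers
    must not exceed cap * sizeof(R_k)."""
    assigned = sum(x[i][j] for i in bronze for j in range(pool_size))
    return assigned <= cap * pool_size

# One bronze and one gold environment over a pool of 10 type-k servers.
x = {"bronze-blog": [1, 0, 0, 0, 0, 0, 0, 0, 0, 0],
     "gold-shop":   [0, 1, 1, 0, 0, 0, 0, 0, 0, 0]}
print(bronze_share_ok(x, {"bronze-blog"}, pool_size=10))  # 1 <= 2.0
```

In the actual system this is not checked after the fact but handed to the solver as a linear constraint over the x_ijk variables.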

The first rule specified in the third example in Section 3.3 can be mapped to the following constraint, but only if the current time of the periodic review is between 15:00 and 17:00. In the constraint, req_k is the set of application environments requesting a server of type k:

sizeof(R_k) − Σ_{i∈req_k} Σ_{j=0..sizeof(R_k)−1} x_ijk ≥ T
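The spare-pool constraint above says that after allocation at least T servers of type k must remain in R_k. A direct check, with invented names and data:

```python
def spares_remaining_ok(pool_size, x, requesters, T):
    """sizeof(R_k) minus all type-k servers handed out must be >= T."""
    allocated = sum(x[i][j] for i in requesters for j in range(pool_size))
    return pool_size - allocated >= T

# Pool of 6 type-k servers; two environments take 2 and 1 of them.
x = {"ae1": [1, 1, 0, 0, 0, 0], "ae2": [0, 0, 1, 0, 0, 0]}
print(spares_remaining_ok(6, x, ["ae1", "ae2"], T=3))  # 6 - 3 = 3 >= 3
```

Outside the 15:00 to 17:00 window, the same check would be run with T' in place of T.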

provisioning of resources. The ability to specify actions provides limited support for defining rules. The application provider needs to specify the attributes of the resources needed, e.g., for a server machine the attributes include speed, memory and disk space. One specification language for this is the Resource Description Framework (RDF) [14]. Although it would be possible to use existing languages for our work, it was determined that this would not be the best option since these languages were not designed for the purposes of this work. After careful consideration it was decided that it would be easier to define a new XML-based SLA language focused on data centers. The rules found in an SLA or a business rule can be specified based on the definition of rules in [10], an XML DTD specifying rules. This was modified to define an event; the event is essentially a condition on the state of the system. Resources are specified using attribute-value pairs. An XML element is used for each type of resource, which is characterized by the attribute-value pairs. Attributes characterizing resources include CPU speed, memory and disk space; other attributes include those specified in Section 3.1. The resource specifications are mapped to a type by the data center management software. Business objectives are specified by defining the attributes and the function to be applied to those attributes (a utility function policy).

The fourth example rule specified in Section 3.3 states that high priority is given to an application environment that has fewer servers than specified in its SLA. This has the effect of taking resources from the resource pool. One possible mapping is to first assign resources to the application environments with the highest priority and then allocate the remaining resources among the rest of the application environments using the optimizer. This suggests that policies have an order of execution: any rule that increases the size of the resource pools should be executed before solving the optimization problem, and any application environment whose priority is "highest priority" should have its resources allocated first. In Section 3.1 an SLA rule (example 6) was defined that can be mapped to the following constraint[1]:

Σ_{j∈R_k} x_ijk − Σ_{j∈R_l} x_ijl = 0

For the fifth example in Section 3.3 the following constraint is used for any machine j of type k:

Σ_{i∈req_k} x_ijk ≤ 1

Assume that the business objective is to maximize profit. For each x_ijk created, the product c_ijk · x_ijk is computed. The objective function is to maximize the sum of all the c_ijk · x_ijk, which represents the maximization of profit. The value c_ijk is AE_i.additional_price_k. This is a simple utility function policy. A more complex utility function for determining c_ijk can be used and will be explored as future work.
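The profit objective above is a straightforward weighted sum over the decision variables. The following toy evaluation, with invented environments and prices, shows the computation the solver maximizes; it is not the Frontline solver itself.

```python
def profit(x, additional_price):
    """Sum of c_ijk * x_ijk, where c_ijk = AE_i.additional_price_k.

    x: {(ae, server_j, machine_type_k): 0 or 1}
    additional_price: {(ae, machine_type_k): price per additional server}
    """
    return sum(additional_price[ae, k] * v for (ae, j, k), v in x.items())

# ae1 takes two additional web servers, ae2 one additional db server.
x = {("ae1", 0, "web"): 1, ("ae1", 1, "web"): 1, ("ae2", 0, "db"): 1}
prices = {("ae1", "web"): 60.0, ("ae2", "db"): 90.0}
print(profit(x, prices))  # 60 + 60 + 90 = 210.0
```

The optimizer searches over 0/1 assignments to x subject to the constraints of this section, rather than evaluating a single fixed x as done here.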

6. PROTOTYPE

The Open Policy Management (OPM) framework is currently being developed to support policy-based management of data centers. The two primary management entities are the Global Perspective Manager (GPM) and the Application Environment Managers (AEMs), both written in Java. Upon instantiation the GPM spawns two threads: the asynchronousListener and the synchronousListener. At the start of a periodic review the synchronousListener polls all application environment managers (AEMs). In response to this polling, an AEMessage is generated by the AEM for the GPM. An AEMessage includes the following information: the number of additional resources (beyond the minimum) needed for a

5. SPECIFICATIONS

There have been several languages defined for the specification of SLAs (e.g., [6]). Most of this work does not consider the specification of penalties [9]. The emphasis is on the specification of service level objectives and actions that can be taken if a condition becomes true, e.g., a service objective is not being satisfied. The action can specify that a penalty is to be paid. The service in this work refers to the

[1] In our implementation we actually use less than one since this is what the solver expects.


particular type and the servers that can be returned to the spare resource pool. Within a GPM there is a PolicyEngine (which consists of a rule engine and the optimization module). Included in the inputs to the PolicyEngine are the SLAs, the business rules and the business objectives. The PolicyEngine extracts the rules specified in the SLAs, the penalties (including the condition under which a penalty is paid), the costs and information about the resource needs. The PolicyEngine stores this information for each application environment. When an SLA is changed, a message is sent to the PolicyEngine. The rules extracted from the SLAs and the business rules each correspond to a class. Each of these classes inherits from a superclass called PolicyClass, and each has a method called map and a variable, priority, used to sort policies. The different subclasses of PolicyClass have different mappings to GPM functionality. For example, the first business rule described in Section 3.3 maps to an event-triggered if condition then action rule that is stored by the rule engine. Upon receipt of the event, if the condition is true, the rule is fired and the specified action is performed. The mapping creates the rule in the ABLE Rule Language (ARL) [5]. The other policies are mapped to an optimization model as described in Section 4. The optimization is done by Frontline software (http://www.solver.com). The algorithm used is branch and bound. Although branch and bound can come up with an optimal solution, it may not always be feasible to run it long enough to do so. The number of iterations can be limited, allowing for faster execution but a possibly suboptimal solution. A periodic review causes the synchronousListener to poll all AEMs. In response to this polling, an AEMessage is received. The information in an AEMessage includes the number of machines needed of a specific type. The GPM then uses the PolicyEngine to determine the resource allocations.
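The PolicyClass hierarchy just described can be sketched as follows. The two subclass names and the strings returned by map() are hypothetical placeholders for the prototype's actual ARL rules and solver constraints; only the shape (a shared superclass with map() and priority) follows the text.

```python
class PolicyClass:
    """Superclass for all extracted policies."""
    priority = 0
    def map(self):
        raise NotImplementedError

class ReplacementRulePolicy(PolicyClass):
    """Business rule 1: maps to an event-triggered rule in the rule engine."""
    priority = 1
    def map(self):
        return ("rule", "on server_down: if sizeof(pool) >= 1 then allocate spare")

class BronzeCapPolicy(PolicyClass):
    """Business rule 2: maps to a constraint in the optimization model."""
    priority = 2
    def map(self):
        return ("constraint", "sum over bronze of x_ijk <= 0.20 * sizeof(R_k)")

# Policies are sorted by priority before mapping, so pool-growing rules
# run before the optimization problem is solved.
policies = sorted([BronzeCapPolicy(), ReplacementRulePolicy()],
                  key=lambda p: p.priority)
print([type(p).__name__ for p in policies])
```

Because each policy is an object created from its specification, replacing a policy replaces an object rather than forcing a recompile, which is the adaptivity property the paper emphasizes.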
The PolicyEngine checks to see if there are new policies that must be taken into account or if a policy has changed. This will either cause a change to a PolicyRuleClass object or the creation of a PolicyRuleClass object. This allows for a change of policies without recompiling the code. The optimization model is solved to determine resource

allocations. The GPM uses this information to create a sequence of operations that add and remove resources to and from the various application environments. This sequence of operations becomes input to Tivoli Provisioning Manager (TPM) [8], which carries out the operations.

7. RELATED WORK

Mathematical optimization algorithms have been used at run-time based on SLA specifications, e.g., [11]. However, most of this work does not examine the role of other factors such as business rules in the development of the optimization model, nor does it study the role of these algorithms in a data center management system. Océano [2] describes a data center that provides for dynamic provisioning of resources, where different customers may have different SLAs. Its main focus is on the design and implementation issues involved in moving servers between application environments, not on algorithms for effective dynamic provisioning. Optimization models developed for data centers include [3,4,12]. Most of this work focuses on a specific set of constraints which do not necessarily correspond to business rules or SLA rules, and there is almost no discussion on how to dynamically derive the optimization models at run-time.

8. DISCUSSION AND FUTURE WORK

This work described policies that can be taken into account in resource allocation. The management system is designed so that it can adapt to changes in policies without recompiling code. Future work includes the following:
• Examine a broader class of policies. For example, highest priority is given to those application environments that fall below the minimum number of servers to be assigned to the application environment; however, this may not always be the most profitable approach.
• Determine reasonable policies to be used if an application environment can asynchronously ask for additional resources.
• Currently it is assumed that there are no conflicts in policies. Future work will look at the detection of conflicts in policies.


• Determine services needed to support the evaluation of the effectiveness of policies.
• Determine how often the periodic review should occur. This partially depends on the time needed for a periodic review; more experiments are needed to evaluate this.
• Evaluate the scalability of the prototype. There are possible improvements to the PolicyEngine that can make it execute more quickly.
The paper provided an integer linear programming model, which is used in the current prototype. The problem can also be formulated as a mixed integer linear program, and there exist techniques that can quickly solve models of this form [11].

9. ACKNOWLEDGEMENTS

We would like to thank the IBM Centre for Advanced Studies and the Natural Sciences and Engineering Research Council (NSERC) of Canada.

10. REFERENCES

1. S. Aiber, D. Gilat, A. Landau, N. Razinkov, A. Sela and S. Wasserkrug, "Autonomic Self-Optimization According to Business Objectives," In Proceedings of the International Conference on Autonomic Computing, 2004.
2. K. Appleby, S. Fakhouri, L. Fong, G. Goldszmidt, M. Kalantar, S. Krishnakumar, D. Pazel, J. Pershing and B. Rochwerger, "Océano - SLA Based Management of a Computing Utility," In Proceedings of the 7th IFIP/IEEE International Symposium on Integrated Network Management, Seattle, WA, USA, pp. 64-71, 2001.
3. M. Aron, P. Druschel and W. Zwaenepoel, "Cluster Reserves: A Mechanism for Resource Management in Cluster-Based Network Servers," ACM SIGMETRICS, Santa Clara, CA, USA, pp. 90-101, 2000.
4. J. Chase, D. Anderson, P. Thakar, A. Vahdat and R. Doyle, "Managing Energy and Server Resources in Hosting Centers," In Proceedings of the ACM Symposium on Operating Systems Principles, New York, NY, USA, pp. 103-116, 2001.
5. J. P. Bigus, D. A. Schlosnagle, J. R. Pilgrim, W. N. Mills III and Y. Diao, "ABLE: A toolkit for building multiagent autonomic systems," IBM Systems Journal, Vol. 41, No. 3, 2002.
6. A. Keller and H. Ludwig, "The WSLA Framework: Specifying and Monitoring Service Level Agreements for Web Services," Journal of Network and Systems Management, Special Issue on E-Business Management, 11:1, Plenum Publishing Corp., March 2003.
7. J. Kephart and W. Walsh, "An Artificial Intelligence Perspective on Autonomic Computing Policies," Fifth International IEEE Workshop on Policies for Distributed Systems and Networks (Policy 2004).
8. E. Manoel, S. C. Brumfield, K. Converse, M. DuMont, L. Hand, G. Lilly, M. Moeller, A. Nemati and A. Waisanen, "Provisioning On Demand: Introducing IBM Tivoli Intelligent ThinkDynamic Orchestrator," IBM Redbooks, December 23, 2003; see http://www.redbooks.ibm.com/abstracts/sg248888.html
9. C. Molina-Jimenez, J. Pruyne and A. van Moorsel, "The Role of Agreements in IT Management Software," Architecting Dependable Systems III, LNCS 3549, 2005.
10. OASIS, "The Cover Pages: ILOG Simple Rules DTD," http://xml.coverpages.org/SRMLsimpleRules-dtd.html
11. C. Santos, X. Zhu and H. Crowder, "A Mathematical Optimization Approach for Resource Allocation in Large Scale Data Centers," HP Labs Technical Report HPL-2002-64R1, HP Laboratories, 2002.
12. K. Shen, H. Tang, T. Yang and L. Chu, "Integrated Resource Management for Cluster-Based Internet Services," In Proceedings of the 5th Symposium on Operating Systems Design and Implementation, Boston, MA, USA, pp. 225-238, 2002.
13. B. Simmons, H. Lutfiyya, M. Avram and P. Chen, "A Policy-Based Framework for Managing Data Centers," In Proceedings of the 2006 IEEE/IFIP Network Operations & Management Symposium (NOMS 2006), Vancouver, Canada, April 3-7, 2006.
14. W3C, "Resource Description Framework (RDF)," http://www.w3.org/RDF.
