Autonomic Tuning Expert – A framework for best-practice oriented autonomic database tuning

David Wiese1, Gennadi Rabinovitch1, Michael Reichert2, Stephan Arenswald2

1 Friedrich-Schiller-University Jena, Germany {david.wiese, gennadi.rabinovitch}@cs.uni-jena.de

2 IBM Deutschland Research & Development GmbH, Boeblingen, Germany {rei, arens}@de.ibm.com

Abstract

Databases are growing rapidly in scale and complexity. High performance, availability, and further service level agreements need to be satisfied under all circumstances to please customers. Tuning a DBMS within its complex environment requires highly skilled database administrators (DBAs), who unfortunately are becoming rarer and more expensive. Improving performance analysis and moving towards the automation of large information management platforms requires a more intuitive and flexible source of decision making. This paper points out the importance of best-practices knowledge for autonomic database tuning and addresses the idea of formalizing and storing DBA expert tuning knowledge for the autonomic management process. We focus on the development of a reference system for best-practice oriented autonomic database tuning for IBM DB2 and subsequently evaluate our system's tuning performance under changing workloads.

1 Motivation

Tuning and administration of database systems within their heterogeneous runtime environments is a complex and non-trivial task [6][13]. The achieved DBMS performance highly depends on the individual skills of expensive DBAs, who need deep knowledge about the DBMS itself, the application stack on top of the DBMS, and the application domain in general. Moreover, to a great extent database tuning is still performed reactively and manually. Because of the dynamics of database systems, with respect to workload type or number of users, mainly reactive tuning leads to time delays between problem occurrence, detection, and resolution. In complex operational environments, minimizing such delays is crucial.

Autonomic Computing (AC), a well-known initiative started by IBM® in 2001, aims to reduce the Total Cost of Ownership of IT systems by making software systems self-regulating. Routine administrative and tuning tasks are shifted towards the IT systems themselves, thus enabling administrators to concentrate on long-term strategic decision making [11][10][17].

Analogous to the IT industry, the health care industry also recognized the need to increase service quality and reduce costs in the last decade. It started formalizing standard operating procedures (SOPs1), i.e. typical approaches for the diagnosis and treatment of patients. Several languages for the formal representation of SOPs have been developed, and workflow engines exist to execute formalized SOPs. In our opinion, the combination of AC concepts and the notion of SOPs, i.e. best-practice approaches, is very appealing.

Plenty of information about database tuning best-practices is available online. The information resources range from traditional news groups2 to IT blogs3, online magazines4, IBM Redbooks® publications5, and product manuals.

Copyright © 2008 David Wiese, Gennadi Rabinovitch and IBM Corp. Permission to copy is hereby granted provided the original copyright notice is reproduced in copies made.

1 http://www.openclinical.org
2 http://www.dbforums.com
3 http://www.ibmdatabasemag.com/blog/main, http://database.ittoolbox.com


But this information is unstructured, i.e. not formalized, and hence not machine-interpretable. As far as we know, there is currently no standard for representing database tuning best-practices formally. Furthermore, to foster the development of such formalized best-practices, a community-based development approach seems to be most appropriate. Our long-term vision is to establish a kind of ecosystem for developing and sharing formalized database tuning SOPs, comparable to community-based software development projects like Eclipse6 and Firefox7 and internet sharing platforms like YouTube8 or Flickr9.

In this paper we describe an infrastructure called Autonomic Tuning Expert (ATE), developed within the scope of a CAS project (Autonomic DB2 Performance Tuning). ATE combines key AC concepts and the notion of database tuning SOPs to implement a feedback control loop for automating typical tuning tasks with minimal human intervention. The ATE infrastructure is designed to be the core component of an ecosystem that enables DBAs to design, exchange, adapt, and execute best-practice tuning methods. As a proof of concept we used ATE for tuning IBM DB2® for Linux®, UNIX® and Windows®. However, the concepts developed are applicable to the tuning and administration of any database-driven software stack. The prototype's effectiveness has been tested under TPC-C-like and TPC-H-like workloads. Since dynamic workload mixes occur in real life [29][8], we also considered an arbitrary sequence of both workloads with the goal of detecting workload shifts.

The remainder of this paper is organized as follows: In Section 2, we review related work. Subsequently, in Section 3, we elaborate on the formalization of typical database tuning actions and their encapsulation into problem-specific tuning plans. Section 4 describes the architecture enabling workload-oriented problem detection and resolution in detail, while a representative subset of the tuning plans implemented so far is summarized in Section 5. Evaluation results of tuning by ATE are presented in Section 6. Finally, in Section 7 we conclude the paper with a brief summary and an outline of further fields of research.

2 Related Work

Today's DBMS manufacturers have recognized the importance and necessity of providing assistance for typical DBA tasks and have made a first step towards (semi-)automation [9]. However, most DBMSs still seem to be situated at the Predictive Level of the Autonomic Computing Maturity Model [10]. IBM's Architectural Blueprint for Autonomic Computing defines a general architecture for building autonomic systems [14] that has become a basis for many research approaches.

Building on IBM's Architectural Blueprint, Powley et al. propose a general framework for the development of autonomic managers that exclusively uses concepts and tools from relational database management systems (namely triggers and stored procedures), as well as reflective programming techniques. They consider controllable features along with the current status of elements to be these elements' self-representation [20].

Benoit presents the idea of using a generic diagnosis system that relies on resource performance feedback for performance tuning. A diagnosis tree is created and traversed in order to find resources that are not tuned properly. This tree incorporates the knowledge of database experts, as well as information gained from the DB2 documentation [1].

Bigus et al. describe AutoTune, a generic agent for automated tuning, constructed using the Agent Building and Learning Environment. They propose a controller architecture that learns the target system model and derives a controller from it. Furthermore, the architecture includes components for constructing, exploiting, and adapting a generic workload forecasting model [3].

Brown et al. explore a feedback-based approach of automatically adjusting resources in order to achieve a set of response time goals. However, they are restricted to DBMS multiprogramming levels and memory allocations [4].

Weikum et al. point out the need for automated, feedback-loop oriented memory management for heavy-load data servers.

4 http://www.ibmdatabasemag.com
5 http://www.redbooks.ibm.com
6 http://www.eclipse.org
7 http://www.mozilla.com/firefox
8 http://www.youtube.com
9 http://www.flickr.com


Their paper gives an overview of self-tuning methods for a spectrum of memory management issues [28].

None of the aforementioned approaches seems to formalize tuning knowledge. They are rather based on comprehensive, mainly hard-coded models, do not take DBAs' best-practices into consideration, and make it almost impossible to incorporate further tuning knowledge. Furthermore, tuning seems to be limited to adjusting resource configuration parameters only.

The health care industry tries to reduce costs and improve the quality of service by formalizing typical approaches for the diagnosis and treatment of patients. Various methods to support the computerization of medical guidelines have been developed by the Medical Informatics community [19]. For example, the Guideline Interchange Format (GLIF) [18] allows the definition of SOPs as abstract flowcharts that subsequently can be enriched with decision logic and corresponding data structures. For the execution of GLIF models, approaches like the Guideline Execution Engine (GLEE) [26], Guideline Execution by Semantic Decomposition of Representation (GESDOR) [27], and Online Guideline Assist (OLGA) [22] exist. However, while some of the knowledge formalization approaches seem to be adaptable to the domain of database tuning, the application domain of the execution approaches is slightly different: it focuses on supporting human actions by providing information based on patients' current diagnoses and complaints, which still have to be provided to the system by humans at runtime.

In a similar way, our approach focuses on the formalization of tuning knowledge. It proposes a novel framework for autonomic database tuning that enables DBAs to encapsulate their best-practices in formalized tuning plans at buildtime and to automatically apply them at runtime to resolve problems based on problem-describing events, minimizing human intervention.

3 Formalizing DBA Knowledge

Formalization and storage of tuning knowledge, comprising information about problem diagnosis and problem resolution, is crucial for a best-practice based tuning system [21]. In this section, some concepts for the definition of problem situations (events) and problem resolutions (tuning plans) are described.

3.1 Problem Situations

Every static (critical set of value combinations) or dynamic (change in system state) situation that may require an automatic reaction of the system can form an atomic event that occurs at a specific point in time. Examples of sensor-based atomic events are exceeded thresholds or specific performance metric values. In addition, user complaints can be treated as events as well. Complex events can be constructed recursively (through operators like conjunction, disjunction, sequence, negation, reduction, etc.) from atomic or complex events with the help of an event algebra [2][5].
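To illustrate the recursive construction, the following minimal Java sketch models atomic events and two algebra operators. All type and method names are our own illustrative assumptions, not part of the ATE implementation or of the cited event algebras.

    // Minimal sketch of an event algebra; names are illustrative, not ATE's API.
    import java.util.List;

    interface Event {
        // does the observed event history satisfy this (atomic or complex) pattern?
        boolean matches(List<Event> history);
    }

    // Atomic event: a threshold exceeded on a performance metric at some point in time.
    record ThresholdExceeded(String metric, double threshold) implements Event {
        public boolean matches(List<Event> history) {
            return history.contains(this);   // records compare structurally
        }
    }

    // Complex events built recursively via operators of the event algebra.
    record Conjunction(Event left, Event right) implements Event {
        public boolean matches(List<Event> history) {
            return left.matches(history) && right.matches(history);
        }
    }

    record Negation(Event inner) implements Event {
        public boolean matches(List<Event> history) {
            return !inner.matches(history);
        }
    }

Further operators (disjunction, sequence, reduction) would follow the same compositional scheme.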

3.2 Problem Resolutions

Complex events are used to trigger tuning plans. A tuning plan is an encapsulated tuning workflow consisting of a number of tuning actions and control structures, enriched with definitions of input and output parameters as well as further metadata. Informally, it can be considered a reaction to past, current, or upcoming problems that pursues one or many user-specified goals. Our event-plan subscription concept identifies all tuning plans associated with specific events (one-to-many relationships between events and tuning plans). At runtime, the tuning plan that best fits the event is determined and selected for execution (see Section 4.4).

Basically, a tuning plan consists of a header, including metadata, and a body, containing the tuning algorithm. The separation of body and header provides a fixed interface and realizes information hiding: the contents of the body can be changed without modifying the corresponding header and need not be visible to the end-user applying the tuning plan. The header data can be split up into the following (a minimal sketch follows below):
• static maintenance data (title, description, goal of the plan, authors, creation/modification date, documentation);
• dynamic runtime data (execution duration, number of executions, cost of execution, effects on resources/resource allocations, etc.);
• associated events (event-plan subscription);
• input and output parameters.
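A minimal Java sketch of the header data and the one-to-many event-plan subscription could look as follows; field and class names are illustrative assumptions, not ATE's actual data model.

    // Sketch of a tuning plan header and the event-plan subscription (names assumed).
    import java.util.*;

    class TuningPlanHeader {
        // static maintenance data
        String title, description, goal, authors, documentation;
        Date created, modified;
        // dynamic runtime data, updated after each execution
        long executions;
        double avgDurationMs, avgCost;
        // fixed interface: input and output parameters
        List<String> inputParams = new ArrayList<>(), outputParams = new ArrayList<>();
    }

    class SubscriptionRegistry {
        // one-to-many relationship between events and tuning plans
        private final Map<String, List<TuningPlanHeader>> byEvent = new HashMap<>();

        void subscribe(String eventId, TuningPlanHeader plan) {
            byEvent.computeIfAbsent(eventId, k -> new ArrayList<>()).add(plan);
        }

        // at runtime, the best-fitting candidate among these is selected (Section 4.4)
        List<TuningPlanHeader> candidates(String eventId) {
            return byEvent.getOrDefault(eventId, List.of());
        }
    }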

Within our prototypical architecture (see Section 4) we focus on a limited subset of the aforementioned tuning plan metadata, sufficient for our demonstration purposes.

For the implementation of the body an arbitrary programming language can be used. However, by providing a limited, well-defined, database-tuning-specific set of steps, we minimize the complexity an administrator would have to deal with when using a general programming language. The modularity of steps eases usage and enables extensive reusability. Furthermore, as administrators build their tuning plans from predefined steps, ATE can obtain semantic knowledge about the meaning of tuning plans, which is crucial for intelligent planning and execution at runtime (tuning plan selection and adaptation, consideration of upper/lower resource allocation boundaries, etc.). Tuning plan steps allow generic access to database systems and their environments. Specific interfaces covering sensor and effector functionality, exemplified in Table 1, have to be provided in order to allow communication with a specific operating system, DB2, DBAs, and third-party tools like IBM DB2 Performance Expert (PE) (Section 4.1). The additional level of abstraction gained by accessing uniform interfaces within the tuning steps allows tuning plans to be applicable and generalizable to any software stack.

Interface   Description
WIN         Windows operating system commands, e.g. read and write environment variables/files; execute scripts (batch, ant, perl, etc.).
DB2_SYS     System commands allowing access to and maintenance of the database manager, e.g. db2start/db2stop, db2advis, db2expln, db2set.
DB2_CLP     Command line processor commands allowing the execution of database utilities, e.g. update dbm/db cfg, runstats, reorg, (re)bind.
DB2_SQL     SQL statements, e.g. connect, commit; create/access user-defined functions and stored procedures; select/insert/update/delete (DML); create/alter/drop bufferpool/index (DDL).
DBA         Interaction with database and application administrators.
PE          Interaction with IBM DB2 Performance Expert via an API, e.g. starting/configuring exception processing.
ATE         ATE interface for metadata access and (re)configuration of tuning system behavior.

Table 1: System component interfaces for tuning steps

While the above-mentioned interfaces foster the extensibility of ATE, it seems reasonable to present tuning steps to users categorized by their semantics. In the following, a brief summary of such tuning step categories is provided.

System monitoring steps are used to obtain monitoring data from various sources or to initiate data collection:
• access the DB catalog;
• obtain snapshot and event monitoring data, either from precollected data or by initiating immediate data collection (continuous, one-time);
• SQL statements for monitoring purposes;
• (de)activate tracing/access trace files;
• (de)activate logging/access log files;
• execute DB2-internal (db2eva, db2pd, etc.) and DB2-external (perfmon, iostat, etc.) tools;
• access various data sources (IBM DB2 Performance Expert, Tivoli® Monitoring for Databases10, operating system, etc.).
Furthermore, they should enable the temporary storage of collected data (create tablespace, create table, etc.).

10 http://publib.boulder.ibm.com/infocenter/tivihelp/v3r1/index.jsp?toc=/com.ibm.itmfd.doc/toc.xml

Database design steps serve the creation or deletion of database objects and the invocation of database design supporting tools:
• verify the existence of database objects (index, partition, tablespace, etc.);
• create/delete database objects (create/drop index/materialized query table/partition/tablespace/bufferpool);
• move a table to a tablespace/partition;
• call db2expln, db2advis, db2look, etc.;
• alter a tablespace (add container, bufferpool mapping, etc.).


Configuration steps are applied to (re)allocate existing resources (such as sortheaps, bufferpools, or the locklist) or to (de)activate mechanisms by obtaining configuration values and changing them:
• get/set registry/environment variables;
• get/set database configuration/database manager configuration;
• call the DB2 configuration advisor;
• get/set bufferpool size;
• static modification operation encapsulating the get and set methods, allowing for a convenient relative (percentage-wise or value-wise) configuration change, semantically equivalent to new_SHEAPsz = old_SHEAPsz + 100 or new_SHEAPsz = old_SHEAPsz + 5%;
• dynamic modification operation, automatically adjusting the step size depending on the current workload and recent configuration changes of the same parameter, semantically equivalent to new_SHEAPsz = old_SHEAPsz + DYN_STEP (a possible reading is sketched below).
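A possible reading of the dynamic modification operation is sketched below. The paper only states that DYN_STEP depends on the current workload and on recent changes of the same parameter; the concrete damping heuristic here is entirely our assumption.

    // Illustrative sketch of new_SHEAPsz = old_SHEAPsz + DYN_STEP.
    // The policy (larger base steps for OLAP, halving after each recent change
    // of the same parameter to avoid oscillation) is an assumption.
    class DynamicStep {
        static int dynStep(int oldValue, String workload, int recentChanges) {
            int base = workload.equals("OLAP") ? oldValue / 10 : oldValue / 20;
            return base >> Math.min(recentChanges, 4);   // damp repeated changes
        }

        static int newSheapSz(int oldSheapSz, String workload, int recentChanges) {
            return oldSheapSz + dynStep(oldSheapSz, workload, recentChanges);
        }
    }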

Tuning decision support steps initiate the (collection and) interpretation/mining of data in order to support the decision making process within tuning plans (trend analysis, workload classification, etc.). Amongst others, SQL statements can be applied for analysis purposes.

Typical maintenance tasks include:
• runstats, reorg, load/import/export, backup/restore;
• DB2 system commands like db2stop and db2start;
• commands like connect/terminate and commit/rollback.

User interaction steps allow the DBA to be involved in tuning tasks that cannot be performed automatically (e.g. because of semantics-related changes). These steps allow the tuning plan definer
• to send notifications containing requests or suggestions, such as a request to restart the DB2 instance in order to let offline-configurable parameters take effect;
• to ask the DBA for permission to proceed with semantics-changing or high-impact configuration changes;
• to provide the DBA with information collected during tuning plan execution that describes the problem situation, in case the problem cannot be resolved automatically.

Tuning plans furthermore support the invocation of user-defined applications whose semantics are unknown to ATE:
• DB2-external tools, scripts, and services;
• DB2-internal calls to execute user-defined functions or stored procedures.

With ATE metadata access steps, it is possible to obtain metadata of all ATE components and their objects' (events, tuning plans, logs) runtime and buildtime data.

While conventional tuning plans address performance problems of the monitored database, meta tuning plans can be used to (re)configure the ATE tuning system itself and thereby influence its behavior. They can refer to changes in the tuning plan execution history, as well as to current and historic workload and system state data. The corresponding steps are based on the PE and ATE interfaces and implement the following functionalities (a sketch of such a meta tuning plan follows below):
• modify PE thresholds and ACT correlations (see Sections 4.1, 4.3);
• (de)activate tuning plans and events;
• influence the decision logic for selecting the appropriate tuning plan in reaction to a recognized event;
• modify resource restrictions (see Section 4.5).
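For illustration, a meta tuning plan reacting to a workload shift might look like the following sketch. PE and ATE here are hypothetical wrappers around the interfaces of Table 1; none of the method names are a real API.

    // Sketch of a meta tuning plan: on a workload shift, swap the active PE
    // threshold set and (de)activate workload-specific tuning plans.
    class PE  { static void activateThresholdSet(String name) { /* via PE interface */ } }
    class ATE {
        static void activatePlansFor(String workload)   { /* via ATE interface */ }
        static void deactivatePlansFor(String workload) { /* via ATE interface */ }
    }

    class WorkloadShiftMetaPlan {
        void execute(String newWorkload) {
            // activate the exception threshold set matching the workload (Section 4.1)
            PE.activateThresholdSet(newWorkload.equals("OLAP") ? "thresholds_olap"
                                                               : "thresholds_oltp");
            // enable plans subscribed to the new workload, disable the others
            ATE.activatePlansFor(newWorkload);
            ATE.deactivatePlansFor(newWorkload.equals("OLAP") ? "OLTP" : "OLAP");
        }
    }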

As mentioned above, tuning steps can obtain and evaluate data. For that reason, data containers (variables and complex data structures, such as temporary tables) are indispensable. Furthermore, typical programming language control structures like loops and branches, and temporal constructs such as waits (for specific events, data, or some period of time), are needed in order to support administrators in specifying tuning steps that will or will not be executed depending on the results of data evaluation. In addition, we consider steps that allow for the explicit invocation of a tuning plan, and the triggering of events (implicit invocation of a tuning plan) within ATE, as control structures as well.

Although their bodies might contain control structures, tuning plans are not supposed to implement whole feedback control loops. Instead, they are meant to be called iteratively. Applying long-lived best-practice methods, ATE attempts to improve performance stepwise, pursuing the well-known tuning principle of changing only one aspect at a time, which reduces the probability of side effects.

4 Architectural Blueprint

This section gives a brief overview of the architecture of the Autonomic Tuning Expert (ATE). In order to achieve results early, ATE's architecture is based on widely accepted and influential technologies and industry-proven products like Eclipse, the Generic Log Adapter, IBM DB2 Performance Expert, Tivoli Active Correlation Technology, and Common Base Events.


These products and technologies are combined to build a component-based MAPE loop [14]. The MAPE loop consists of four parts, as depicted in Figure 1: the Event Collector realizes the monitoring phase, the Event Correlator corresponds to the analyze phase, the Tuning Plan Selector implements the planning phase, and the Tuning Plan Executor corresponds to the execution phase. There are also components that do not belong to the MAPE loop, such as the Workload Classifier and IBM DB2 Performance Expert, but that are an integral part of the overall architecture. All mentioned components are described in further detail in the following sections.

4.1 DB2 Performance Expert

IBM DB2 Performance Expert (PE) is a versatile DB2 database monitor that supports both online database monitoring based on DB2 snapshots and the analysis of historical, aggregated performance data using a performance database (Performance Warehouse). PE's online monitoring capabilities enable a DBA to analyze locking conflict scenarios, long-running SQL statements, current database cache sizes, and general system information like disk space or physical memory utilization [7].

PE's event and periodic exception processing feature supports DBAs in proactively monitoring database systems. A DBA is able to define threshold values for relevant key performance indicators (KPIs) like the number of deadlocks, bufferpool hit ratios, or sort overflows. Moreover, the DBA decides how often the PE server should check for threshold violations. If a KPI violates a defined threshold, the DBA gets notified about the problem situation. PE offers several ways to notify a DBA about the occurrence of an exception: sound alerts, visual alerts in the PE client GUI, e-mail notification, and the invocation of a user exit routine. The user exit routine provides information about an exception situation in XML format to an arbitrary consumer. The ATE framework employs a small C program to receive the exception information from PE and feed it into the MAPE loop.

4.2 Event Collector

The Event Collector component is capable of continuously monitoring different event sources for thrown events. It is based on the Generic Log Adapter (GLA) that is part of the Eclipse Test & Performance Tools Platform (TPTP). Although the GLA provides a comprehensive monitoring infrastructure for log files, its actual purpose is to convert various types of log information into a common standard format, the Common Base Event (CBE). The CBE specification is a widely used XML-based format for intercommunication between different enterprise systems that support logging, problem determination, or autonomic computing functions [12]. This means that nearly every message, independent of its format, can be transformed to and represented as a CBE.

To support this, the GLA provides two types of adapters. The first one, the rule-based adapter, can be used to transform single-line messages into CBEs using regular expressions. The second type, the static adapter, offers the possibility to process more complex events like those formatted as XML fragments. ATE implements a static adapter for the PE user exit output (see Section 4.1). ATE has been designed to support an arbitrary number of concurrent adapters as event sources. To deploy additional adapters, for example to monitor IBM Tivoli products or the IBM DB2 diagnostic log, either a new static adapter has to be implemented or an adapter file for a rule-based adapter has to be provided. After incoming events from the various adapter sources have been transformed into CBEs, they are sent to the Event Correlator.
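Conceptually, a rule-based adapter does what the following Java sketch does: a regular expression dissects a one-line log message into a structured, CBE-like record. The record type and the assumed log format are stand-ins for illustration; real adapters emit Common Base Events through the GLA runtime.

    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    // CBE-like stand-in; real adapters produce Common Base Events.
    record CbeLike(String source, String severity, String msg) {}

    class RuleBasedAdapterSketch {
        // assumed format: "2008-05-01 12:00:00 WARNING [DB2PE] sort overflows exceeded"
        private static final Pattern LINE =
            Pattern.compile("^\\S+ \\S+ (\\w+) \\[(\\w+)\\] (.*)$");

        static CbeLike toCbe(String logLine) {
            Matcher m = LINE.matcher(logLine);
            if (!m.matches()) return null;   // line not covered by this rule
            return new CbeLike(m.group(2), m.group(1), m.group(3));
        }
    }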

4.3 Event Correlator

The Event Correlator deploys the IBM Tivoli Active Correlation Technology (ACT) [2], an engine for filtering and correlating events. ACT supports various input event formats; among others, CBE events can be used out of the box. As stated in Section 4.2, the Event Collector (GLA) transforms source event messages into CBEs, which can therefore be processed by the ACT engine immediately. The Event Correlator provides a FIFO queue that stores every incoming event. As the queue fills up, the events are consecutively processed by ACT. ACT uses so-called rules to filter events, or to correlate and aggregate event sequences into complex events. There are different types of rule patterns, as described in [2]:


Figure 1: ATE Architectural Blueprint

• Filter Pattern: This pattern creates a correlated event each time an event is found in the input queue that matches the rule definition. If a match is found, an arbitrary action is taken as specified in the rule. This is the only pattern that is not aware of its state; all other patterns are stateful.
• Collection Pattern: As the name suggests, events are collected over a specified period of time. At the end of the time period, a complex event containing all incoming events is created. This way, an arbitrary response action has access to all collected events.
• Duplicate Pattern: This pattern is similar to the Collection Pattern. The difference is that the first incoming event creates the correlated event, and subsequent incoming events of the same type are ignored for the duration specified in the pattern.
• Computation Pattern: This pattern is another near relative of the Collection Pattern. Each time a new event arrives, this pattern applies a computation function.
• Threshold Pattern: This pattern creates a correlated event if the number of event occurrences exceeds a certain threshold within a specified time window (see the sketch after this list).
• Sequence Pattern: This pattern creates a complex event after a predefined sequence of various events has occurred, either ordered or unordered.
• Timer Pattern: A complex event is created after a specified period of time.
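To make the Threshold Pattern semantics concrete, the following Java sketch re-implements it over a sliding time window. This is a conceptual illustration, not ACT's actual engine code.

    import java.util.ArrayDeque;
    import java.util.Deque;

    // Threshold Pattern semantics: fire a correlated event once the number of
    // occurrences within a sliding time window reaches the threshold.
    class ThresholdPatternSketch {
        private final Deque<Long> timestamps = new ArrayDeque<>();
        private final int threshold;
        private final long windowMs;

        ThresholdPatternSketch(int threshold, long windowMs) {
            this.threshold = threshold;
            this.windowMs = windowMs;
        }

        /** Returns true iff this occurrence completes the pattern. */
        boolean onEvent(long nowMs) {
            timestamps.addLast(nowMs);
            while (!timestamps.isEmpty() && nowMs - timestamps.peekFirst() > windowMs) {
                timestamps.removeFirst();   // drop occurrences outside the window
            }
            return timestamps.size() >= threshold;
        }
    }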

With the help of the rule patterns provided by ACT, it is possible to model real-world database performance problems and real-world DBA tuning behavior very precisely. First of all, ACT reduces the event flood, since only incoming events that are referenced in a rule pattern are recognized and processed. This behavior is comparable to an experienced DBA who primarily acts on problem notifications for which he knows a tuning action (tuning plan). Note that the Event Correlator thus works very differently from the Event Collector, which processes every incoming event. Secondly, patterns like the Collection Pattern and the Duplicate Pattern prevent the ATE framework from hasty overreactions to problem notifications. Like a DBA, the ATE framework selects a tuning plan for the first problem notification and ignores any subsequent problem notification of the same type for a certain amount of time. Thus ATE can comply with the golden rule of tuning, "one change to the system at a time", and has time to validate what effect a chosen tuning plan has on the state of the monitored system. Finally, the Threshold Pattern prevents reactions to singular problem notifications. Singular problem notifications are mostly ignored by DBAs, but if a certain problem occurs over and over again, DBAs pay attention to it.

As already mentioned, ACT only processes incoming events if they are explicitly referenced in a rule. The ATE framework extends this behavior


by implementing database-workload-aware rule patterns. In other words, an incoming event is only processed by ACT if the event is referenced in a rule and if the current workload of the monitored database matches the workload type of the rule. This additional mechanism is described in more detail in Section 4.6.

Every time the ACT engine detects a complex event, a rule response is created. This response consists of the original CBE, which initiated the rule, and the rule name (rule identifier). The original CBE already contains information provided by PE, like the database name and the names of affected objects. The ATE framework also enriches the rule response with meta information like the workload currently recognized on the monitored database system. The rule identifier of the detected complex event is used by the Tuning Plan Selector to determine which tuning plan to select for execution. No new event from the input queue is processed until the tuning plan chosen by the Tuning Plan Selector has been executed.
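The workload-aware gating can be pictured as in the following Java sketch: an event reaches the correlation engine only if some rule references its type and that rule's workload matches the workload currently recognized by the Workload Classifier. All names are illustrative.

    import java.util.Map;
    import java.util.Set;

    // Sketch of workload-aware event gating in front of the correlation engine.
    class WorkloadAwareGate {
        private volatile String currentWorkload = "OLTP";       // updated by the WCL
        private final Map<String, Set<String>> ruleWorkloads;   // event type -> workloads with a rule

        WorkloadAwareGate(Map<String, Set<String>> ruleWorkloads) {
            this.ruleWorkloads = ruleWorkloads;
        }

        void onWorkloadShift(String workload) { currentWorkload = workload; }

        boolean accept(String eventType) {
            Set<String> workloads = ruleWorkloads.get(eventType);
            return workloads != null && workloads.contains(currentWorkload);
        }
    }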

4.4 Tuning Plan Selector

Tuning plans, together with some meta information, are stored in the Tuning Plan Repository. The Tuning Plan Selector is responsible for retrieving the tuning plan that best resolves the problem associated with the incoming rule response sent by the Event Correlator. After a proper tuning plan has been identified, the Tuning Plan Executor is invoked. The problem-context-indicating CBE is passed along, ensuring that knowledge about the problem itself is also available during tuning plan execution.

4.5 Tuning Plan Executor

This component is responsible for executing the tuning plan chosen by the Tuning Plan Selector. Only one tuning plan is processed at a time, to avoid two parallel tuning plans having conflicting effects. Nevertheless, future versions might support the parallel execution of tuning plans to increase tuning plan throughput and thus improve the responsiveness of database tuning.

In addition, tuning plans changing numeric database or database manager configuration parameters, such as sortheap, logfilsiz, or query_heap_sz, are constrained by a simple resource restriction model that is evaluated at runtime. This model supports the definition of upper and lower boundaries that depend on the amount of physical memory available. It is used to avoid values that are too high or too low for resource parameters, as sketched below.
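The following Java sketch shows one way such a restriction model could clamp parameter changes. The concrete boundary functions are invented for illustration; the paper does not specify them.

    // Sketch of the resource restriction model: per-parameter upper and lower
    // boundaries derived from available physical memory. The boundary functions
    // below (fixed minimum, at most a quarter of memory) are assumptions.
    class ResourceRestriction {
        final long physicalMemPages;   // e.g. 4 KB pages of physical memory

        ResourceRestriction(long physicalMemPages) {
            this.physicalMemPages = physicalMemPages;
        }

        long lowerBound(String param) { return 16; }                   // minimal sane allocation
        long upperBound(String param) { return physicalMemPages / 4; } // cap per parameter

        /** Clamp a requested value for a numeric parameter such as sortheap. */
        long clamp(String param, long requested) {
            return Math.max(lowerBound(param), Math.min(requested, upperBound(param)));
        }
    }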

4.6 Workload Classifier

Tuning a database system heavily depends on the current workload type. A DBA has different, sometimes disjoint, tuning goals in mind when dealing with different workload types. For example, a database system configuration suitable for one workload type might be totally insufficient for another. On the other hand, trying to use one database system configuration for all workload types usually results in suboptimal performance due to insufficient resource utilization. To address this problem, a general classification framework has been implemented and integrated into ATE. The classification framework provides the following features:
• definition of classification models;
• data collection for classification models;
• classifier learning;
• classifier deployment.

A classification model is a set of arbitrary columns of IBM DB2 Performance Expert history snapshot tables (see also Section 4.1). The classification framework copies the values of the classification model columns for a specified period of time (number of snapshots) to database tables maintained by the framework. After enough data has been collected, the copied snapshot data is post-processed (cleansing, delta-processing) and tagged with a class label. IBM Intelligent Miner™ is used to learn a classifier tree from the collected and tagged performance data. The result of this learning process is a DB2 stored procedure implementing the classifier tree; it can be called from any DB2 application to classify data that complies with the classification model (see the sketch below). The classification framework can be used in a variety of ways, for example to develop classifier trees that can be used in tuning plans. Primarily, it is used to learn a database workload classifier. This classifier is the core of ATE's Workload Classifier (WCL) component, which classifies the current workload of the monitored database.
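Since the classifier is deployed as a DB2 stored procedure, invoking it from an application boils down to a standard JDBC call, as in the sketch below. The procedure name and its parameter shape are assumptions for illustration; only the JDBC API itself is real.

    import java.sql.*;

    // Sketch of calling the learned classifier tree, deployed as a stored procedure.
    class WorkloadClassifierCall {
        static String classify(Connection con, double deltaSelects, double deltaSorts)
                throws SQLException {
            // procedure name and parameters are hypothetical
            try (CallableStatement cs =
                     con.prepareCall("{CALL ATE.WORKLOAD_CLASSIFIER(?, ?, ?)}")) {
                cs.setDouble(1, deltaSelects);   // post-processed (delta) snapshot metrics
                cs.setDouble(2, deltaSorts);
                cs.registerOutParameter(3, Types.VARCHAR);
                cs.execute();
                return cs.getString(3);          // e.g. "OLTP" or "OLAP"
            }
        }
    }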


It is worth pointing out that the classification model of the workload classifier uses system- and tuning-independent database performance metrics only. This guarantees that the classifier is independent of the computer system the monitored database is hosted on, and that the tuning process itself has no impact on the classification result. At the moment, ATE's Workload Classifier component is able to distinguish two workload classes: OLAP (TPC-H [25]) and OLTP (TPC-C [24]). However, further workload classes could be integrated with little overhead.

The Workload Classifier sends a CBE event containing information about the current database workload type to the Event Correlator as soon as a workload shift is detected. The Event Correlator uses the workload type information to manipulate the global ACT engine state, in order to support workload-aware rule patterns and to be able to pass workload type information to the Tuning Plan Selector and Tuning Plan Executor. In addition, the ACT engine generates a correlated event (rule response), which in turn triggers a meta tuning plan (see Section 3.2) responsible for dynamically adapting the periodic exception threshold set used by PE. The WCL thus ensures that tuning plans and tuning steps are selected depending on the current workload type and that the most accurate set of threshold definitions for the current workload is activated in PE.

5 Description of Implemented Tuning Plans

During our research, best-practice oriented problem diagnosis and resolution knowledge has been extracted from the official DB2 documentation, white papers, and Redbooks publications, and captured in tuning plans along with their triggering events. The implemented tuning plans cover the areas of online resource allocation and configuration (heaps, caches, locklist), physical database design, and statement tuning. The implemented tuning plans dealing with the manipulation of resource configuration parameter values are not very sophisticated and may result in resource over-allocation in some cases. We try to mitigate this with the simplified resource restriction concept described in Section 4.5. Since most DBMSs already provide efficient built-in mechanisms that leverage system-internal knowledge for autonomic resource tuning, we use resource tuning mainly as a proof of concept for our complementary black-box oriented tuning approach.

This section describes a representative subset of the implemented tuning plans in more detail. Simplified source code is provided to demonstrate the expressiveness of the tuning steps proposed in Section 3. As described in Section 4, problems are detected by IBM DB2 Performance Expert as atomic threshold-based events, correlated into complex events, and finally mapped to corresponding problem-resolving tuning plans at runtime. In order to support workload-dependent exception processing, we define a set of thresholds per workload type. In the following, the workload dependency of IBM DB2 Performance Expert thresholds is expressed as:

metric [operator, (warning value / problem value)OLTP, (warning value / problem value)OLAP]

Each PE KPI maps to an aspect of health of a particular database object (bufferpool, sortheap, cache, etc.). Workload-specific warning and problem threshold values set upper or lower boundaries on the KPI, defining normal, warning, and problem ranges depending on the operator (< or >). While actions addressing the problem state should be taken immediately, there are several possible strategies to react to warnings: predictive actions, sophisticated further root cause analysis, as well as notification of the DBA are imaginable.
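The semantics of this notation can be stated compactly as in the following Java sketch; the enum and method names are illustrative.

    // Sketch of evaluating the workload-dependent threshold notation
    // metric [operator, (warning/problem)_OLTP, (warning/problem)_OLAP].
    enum KpiState { NORMAL, WARNING, PROBLEM }

    class ThresholdEval {
        static KpiState evaluate(double value, char op, double warning, double problem) {
            if (op == '>') {            // higher is worse, e.g. rows read per selected row
                if (value > problem) return KpiState.PROBLEM;
                if (value > warning) return KpiState.WARNING;
            } else if (op == '<') {     // lower is worse, e.g. bufferpool hit ratio
                if (value < problem) return KpiState.PROBLEM;
                if (value < warning) return KpiState.WARNING;
            }
            return KpiState.NORMAL;
        }
    }

For the OLTP thresholds of Section 5.1 below, evaluate(12, '>', 5, 10) yields PROBLEM, while a value of 7 only yields WARNING.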


5.1 Index Design

Atomic Event: avg num of rows read per selected row [>, (5/10)OLTP, (25/50)OLAP]



Complex Event: DUPLICATE(avg num of rows read per selected row, 5 min)

Tuning Plan: Once the user-defined threshold of "avg num of rows read per selected row" is exceeded (PE event), a duplicate rule is activated and results in the execution of the corresponding tuning plan; all subsequent PE events of the same type are ignored for 5 minutes, as specified in the rule definition. Immediately after its execution is initiated, the tuning plan accesses all necessary information provided within the CBE (database name and workload type, lines 1-2 of Listing 1).


     1  db_Name  = ATE.get_Param_Value("db_name");
     2  workload = ATE.get_Param_Value("workload");
     3  myCon = DB2_SQL.Connect_DB("DB2PE");
     4  Table result = DB2_SQL.Mon_SQL("SELECT max(AVG_NUMBER_ROWS_SELECTED_ROW), stmt_text
     5      from db2pm_9.dynsql where DB_NAME='"+db_Name+"'
     6      AND UCASE(stmt_text) not like '%SYS%' AND UCASE(stmt_text) like 'SELECT%'
     7      AND interval_to >= (current timestamp - 5 minutes)
     8      AND AVG_NUMBER_ROWS_SELECTED_ROW is not NULL
     9      group by stmt_text order by 1 desc fetch first 5 rows only", myCon);
    10  infile = …
    11  for (int row = 0; row < result.get_size(); row++) {
    12      OS.write_File(result.get_row(row).get_column("stmt_text"), infile)
    13  }
    14  if workload = "OLTP" { exec_time = 1; advise_type = "I"; }
    15  if workload = "OLAP" { exec_time = 5; advise_type = "IM"; }
    16  outfile = DB2_SYS.DB2ADVIS(db_Name, exec_time, advise_type,
    17                             infile, outfile);
    18  DB2_CLP.DB2_BATCH(outfile, db_Name);

Listing 1: Tuning Plan – Index Design

Based on the ratio "rows read/rows selected"11 (code lines 4-9), the tuning plan determines the top-5 SQL statements12. Thereby it ignores all non-select statements as well as all "SYS%" tables and views, since they cannot be indexed. The determined top-5 statements are saved in a file (code lines 10-13) that is provided as input to the DB2 Design Advisor [15], as seen in line 17. Additionally, further information could be passed to the advisor, e.g. the schema for unqualified tables and simulation catalog tables. Subsequently, the indices suggested by the Design Advisor are automatically created (line 18) by executing the batch script generated by the DB2 DDL tool; the script also contains runstats statements guaranteeing that the statistics are updated afterwards. Depending on the workload type, the Design Advisor is called with different parameters (lines 14 and 15): for OLTP workloads, the advise time is set to its minimum and only indices are recommended; for OLAP workloads, it seems useful to give the advisor more time for index calculations and to let it recommend materialized views as well. The execution of this tuning plan is expected to drastically reduce the ratio "rows read/rows selected" by using the newly created indices efficiently, thereby increasing performance. We plan to also include insert, update, and delete statements in the input of the advisor in order to improve the quality of tuning.
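For readers unfamiliar with the tool, the DB2_SYS.DB2ADVIS step plausibly wraps a call to the db2advis command line tool, roughly as in the Java sketch below. The db2advis options (-d database, -i statement input file, -m advise types, -t time limit in minutes, -o output script) exist in DB2 for Linux, UNIX and Windows; the wrapper class itself is our assumption.

    import java.io.File;
    import java.io.IOException;

    // Sketch of a DB2 Design Advisor invocation as the DB2_SYS.DB2ADVIS step might do it.
    class Db2AdvisStep {
        static void run(String dbName, int execTimeMin, String adviseType,
                        File infile, File outfile) throws IOException, InterruptedException {
            Process p = new ProcessBuilder(
                    "db2advis", "-d", dbName,
                    "-i", infile.getAbsolutePath(),
                    "-m", adviseType,              // "I" = indexes, "IM" = indexes + MQTs
                    "-t", String.valueOf(execTimeMin),
                    "-o", outfile.getAbsolutePath())
                .inheritIO()
                .start();
            p.waitFor();   // the generated DDL script is then executed via the CLP (line 18)
        }
    }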

5.2 Long Running Queries

Atomic Event: not applicable

Complex Event: TIMER(30 min)

Tuning Plan: Every 30 minutes, the tuning plan (Listing 2) determines the top 10 statements based on execution time and number of executions. In contrast to the previous tuning plan, the write_File step materializes the whole SQL output in a file at once. Subsequently, the DBA is notified via console and log about the location of the file, which serves as input for his further problem diagnosis. We intend to provide notification functionality via e-mail as well.

    db_Name = ATE.get_Param_Value("db_name");
    myCon = DB2_SQL.Connect_DB("DB2PE");
    Table result = DB2_SQL.Mon_SQL("SELECT avg(total_exec_time) * max(num_executions), stmt_text
        from db2pm_9.dynsql where DB_NAME='"+db_Name+"'
        AND UCASE(stmt_text) not like '%SYS%' AND UCASE(stmt_text) like 'SELECT%'
        AND interval_to >= (current timestamp - 30 minutes)
        group by stmt_text order by 1 desc fetch first 10 rows only");
    result_file = OS.write_File(result, result_file);
    DBA.notify_DBA("console+log", "TopN statements have been determined. Please refer to "
        + result_file + " for further diagnosis");

Listing 2: Tuning Plan – Long Running Queries

5.3 Bufferpool Tuning

Atomic Event: BPHR13 [
