J Intell Inf Syst (2008) 30:249-271
DOI 10.1007/s10844-006-0036-6

Is it DSS or OLTP: automatically identifying DBMS workloads

Said Elnaffar · Pat Martin · Berni Schiefer · Sam Lightstone

Received: 27 December 2004 / Revised: 25 November 2006 / Accepted: 15 December 2006 / Published online: 14 February 2007
© Springer Science + Business Media, LLC 2007
Abstract The type of the workload on a database management system (DBMS) is a key consideration in tuning the system. Allocations for resources such as main memory can be very different depending on whether the workload type is Online Transaction Processing (OLTP) or Decision Support System (DSS). A DBMS also typically experiences changes in the type of workload it handles during its normal processing cycle. Database administrators must therefore recognize the significant shifts of workload type that demand reconfiguring the system in order to maintain acceptable levels of performance. We envision intelligent, autonomic DBMSs that have the capability to manage their own performance by automatically recognizing the workload type and then reconfiguring their resources accordingly. In this paper, we present an approach to automatically identifying a DBMS workload as either OLTP or DSS. Using data mining techniques, we build a classification model based on the most significant workload characteristics that differentiate OLTP from DSS and then use the model to identify any change in the workload type. We construct and compare classifiers built from two different sets of workloads, namely the TPC-C and TPC-H benchmarks and the Browsing and Ordering profiles from the TPC-W benchmark. We demonstrate the feasibility and success of these classifiers with TPC-generated workloads and with industry-supplied workloads.
S. Elnaffar (✉)
College of IT, UAE University, Al-Ain, 17555, UAE
e-mail: elnaffar@uaeu.ac.ae

P. Martin
School of Computing, Queen's University, Kingston, ON K7L 3N6, Canada
e-mail: martin@cs.queensu.ca

B. Schiefer · S. Lightstone
IBM Canada, IBM Software Lab, Markham, ON L6G 1C7, Canada

B. Schiefer
e-mail: schiefer@ca.ibm.com

S. Lightstone
e-mail: light@ca.ibm.com
Keywords Autonomic systems · Autonomic DBMS · Data mining · Classification · Workload characterization · Decision Support System · On-line Transaction Processing · DSS · OLTP · Self-managed DBMS · Database Management Systems · Performance tuning · Applied artificial intelligence
1 Introduction

Commercial relational database management systems (DBMSs) are used for a range of applications from data warehousing through online transaction processing. As a result of this demand, DBMSs have continued to grow in terms of their size and the functionality they provide. This growth in turn creates additional overhead in associated system complexity and administration costs (Lightstone, Schiefer, Zilio, & Kleewein, 2003). The complexity of systems is increasing the component of total system cost that is attributable to human-related costs (Brown Associates, 2000) while at the same time increasing the likelihood of administrative error (Fox & Patterson, 2003). Autonomic computing systems are proposed as a solution to this complexity crisis. An autonomic computing system has the following properties (Ganek & Corbi, 2003):

- The system is aware of itself and able to act accordingly.
- The system is able to configure and reconfigure itself under varying and unpredictable conditions.
- The system is able to recover from events that cause it to malfunction.
- The system is able to anticipate optimized resources needed to perform a task.
- The system is able to protect itself.
In this paper we consider the general problem of providing autonomic DBMSs with the ability to dynamically reconfigure or tune themselves. In order to tune a system we must understand the workload placed on that system, so automatic workload characterization is a prerequisite for automatic tuning. We focus on the specific workload characterization problem of identifying the type of a workload. Database administrators (DBAs) tune a DBMS based on their knowledge of the system and its workload. The type of the workload, specifically whether it is Online Transactional Processing (OLTP) or Decision Support System (DSS), is an important criterion for tuning (IBM Corp., 2000; Oracle Corp., 2001b). Memory resources, for example, are allocated very differently for OLTP and DSS workloads. OLTP workloads contain a large number of small transactions that involve retrievals of individual records based on key values and updates. There are relatively few sorts and joins. DBAs therefore typically allocate memory to areas such as the buffer pools and log buffers while minimizing the sort heap. DSS workloads, on the other hand, contain a small number of large transactions that involve scans, joins and sorts. There are very few, if any, updates. DBAs therefore typically allocate more memory to the sort heap and the buffer pools. The problem of correctly identifying the type of a workload on a DBMS is compounded by the fact that the type of a workload tends to vary over time. Many organizations experience daily, weekly or monthly processing cycles that mirror the cycle of tasks performed by the organization. A bank, for example, may experience an OLTP workload during normal business hours, a DSS-like workload overnight while the previous day's business is analyzed, and a heavy DSS workload at the end of the month when financial reports and executive summaries are issued. DBAs must therefore recognize the significant
shifts in the workload and reconfigure the system in order to maintain acceptable levels of performance. The problem of developing an approach to automatically identify the type of a workload is challenging for the following reasons:

- There are no rigorous, formal definitions of what makes a workload DSS or OLTP. We currently have only general rules such as: complex queries are more prevalent in DSS workloads than in OLTP workloads, and a DSS workload has fewer concurrent users accessing the system than does an OLTP workload.
- An intelligent computing solution requires that the DBMS identify the workload type using only information available from the DBMS itself or from the operating system (OS). No human intervention should be involved.
- A solution must be online and inexpensive, which entails adopting lightweight monitoring and analysis tools to reduce system perturbation and minimize performance degradation.
- A solution must be tolerant of changes in the system settings and in the DBMS configuration parameters.
- A solution should assess the degree to which a workload is DSS or OLTP, that is, the concentration of each type in the mix. Any subsequent performance tuning procedure should be a function of this assessment.

The main contribution of the paper is a solution to the problem of automatically identifying the type of a workload that meets the above challenges. Our solution treats workload type identification as a data mining classification problem, in which DSS and OLTP are the class labels, and the data objects classified are database performance snapshots. We first construct a workload model, or a workload classifier, by training the classification algorithm on sample OLTP and DSS workloads. We then use the workload classifier to identify snapshot samples drawn from unknown workload mixes. The classifier scores the snapshots by tagging them with one of the class labels, DSS or OLTP.
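In outline, the scoring step reduces to tagging each snapshot and counting labels. A minimal sketch in Python; the threshold rule below is a hypothetical stand-in for the trained classifier, and the attribute name is invented for illustration:

```python
# Tag each performance snapshot as DSS or OLTP and report the concentration
# (relative proportion) of each type in the mix. `classify` is a toy
# placeholder for the real trained workload classifier.

def classify(snapshot: dict) -> str:
    # Toy rule: very read-heavy snapshots look like DSS.
    return "DSS" if snapshot["queries_ratio"] > 0.9 else "OLTP"

def score(snapshots: list) -> dict:
    """Return the fraction of snapshots tagged with each class label."""
    tags = [classify(s) for s in snapshots]
    return {label: tags.count(label) / len(tags) for label in ("DSS", "OLTP")}

sample = [{"queries_ratio": 0.95}, {"queries_ratio": 0.97},
          {"queries_ratio": 0.40}, {"queries_ratio": 0.99}]
print(score(sample))  # {'DSS': 0.75, 'OLTP': 0.25}
```

The counts of DSS- and OLTP-tagged snapshots, not any single tag, are what characterize the mix.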
The number of DSS- and OLTP-tagged snapshots reflects the concentration (relative proportions) of each type in the mix. Figure 1 conceptually portrays how a workload classifier can be integrated with a DBMS. Based on the SQL statements that the DBMS processes, the workload classifier obtains and analyzes performance measures, in the form of snapshots, from the DBMS. The workload classifier reports the type of the workload, via the DSSness index as explained later, to the tuning and optimization modules that take care of adjusting the DBMS configuration parameters in light of the identified workload type.

Fig. 1 The appropriate tuning policy is adopted upon identifying the type of the workload by the classifier (SQL → snapshots → workload classifier → DSSness index → tuning and optimization rules → resetting configuration parameters)

In building the workload classifier we first had to address the issues of what makes a workload OLTP or DSS and what data do we use to make the distinction. The decision process can involve static data, such as query SQL and access plans, dynamic data, such as performance data collected by system monitors, or both. We chose to base our analysis on dynamic data collected by DBMS performance monitors because the dynamic data includes frequency information that is not available from static sources, captures the variability of the workload over time, and is easier to analyze than SQL or access plans. We analyze attributes available from performance snapshots with respect to their suitability for differentiating between DSS and OLTP and derive a set of attributes to be used in our classification process. We construct and evaluate several classifiers in the paper. One classifier, which we call Classifier(C, H), is built using OLTP and DSS training data from the TPC-C™ (TPC, 2001a) and TPC-H™ (TPC, 1999b) benchmarks, respectively. As an OLTP system benchmark, TPC-C simulates a complete environment where a population of terminal operators executes transactions against a database. The benchmark is centered around the principal activities (transactions) of an order-entry environment. These transactions include entering and delivering orders, recording payments, checking the status of orders, and monitoring the level of stock at the warehouses. While the benchmark portrays the activity of a wholesale supplier, TPC-C is not limited to the activity of any particular business segment, but, rather, represents any industry that must manage, sell, or distribute a product or service.
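As noted above, the tuning and optimization modules adjust configuration parameters once the workload type is known. The contrast between the two tuning policies can be illustrated with memory-allocation templates; the parameter names and shares below are hypothetical illustrations, not DB2 defaults:

```python
# Hypothetical memory-allocation templates contrasting OLTP and DSS tuning.
# Shares are illustrative fractions of a memory budget, not vendor defaults.

TUNING_TEMPLATES = {
    # OLTP: many small keyed transactions -> favour buffer pools and log buffer.
    "OLTP": {"buffer_pool": 0.70, "log_buffer": 0.15, "sort_heap": 0.05, "other": 0.10},
    # DSS: few large scan/join/sort queries -> favour the sort heap as well.
    "DSS":  {"buffer_pool": 0.50, "log_buffer": 0.02, "sort_heap": 0.38, "other": 0.10},
}

def allocate(workload_type: str, total_mb: int) -> dict:
    """Split a memory budget (in MB) according to the workload-type template."""
    template = TUNING_TEMPLATES[workload_type]
    return {area: round(total_mb * share) for area, share in template.items()}

print(allocate("OLTP", 4096))
print(allocate("DSS", 4096))
```

A real tuning module would of course derive such shares from the DSSness assessment rather than from two fixed templates.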
On the other hand, TPC-H is a decision support benchmark. It consists of a suite of business oriented ad-hoc queries and concurrent data modifications. The queries and the data populating the database have been chosen to have broad industry-wide relevance. This benchmark illustrates decision support systems that examine large volumes of data, execute queries with a high degree of complexity, and give answers to critical business questions. A second classifier, which we call Classifier(O, B), is built using OLTP and DSS training data from the Ordering and Browsing profiles of the TPC-W™ (TPC, 2001b) benchmark, respectively. TPC-W comprises a set of basic operations designed to exercise transactional web system functionality in a manner representative of internet commerce application environments. These basic operations have been given a real-life context, portraying the activity of a web site (bookstore) that supports user browsing, searching and online ordering activity. Visitor activities are described by three profiles: Browsing, Shopping, and Ordering. The Browsing profile is characterized by extensive browsing and searching activities. The Shopping profile exhibits some product ordering activities but browsing is still dominant. The Ordering profile has a majority of ordering activities. Therefore, the ultimate difference between these profiles is the browse-to-order ratio. We also consider two generic classifiers that are built from a range of flavours of OLTP and DSS workloads. We hypothesize that such classifiers will be able to recognize a wider range of workloads than the ones built from more specialized training workloads. The first generic classifier, which we call the hybrid classifier (HC), is constructed by training it on a mix of the TPC-H and the Browsing profile workloads as a DSS sample, and a mix of the TPC-C and the Ordering profile workloads as an OLTP sample.
The second generic classifier, which we call the graduated-hybrid classifier (GHC), is built using different intensities of OLTP and DSS as shown in Fig. 2. We consider TPC-H to be a Heavy DSS (HD) workload, the Browsing profile to be a Light DSS (LD) workload, TPC-C to be a Heavy OLTP (HO) workload and the Ordering profile to be a Light OLTP (LO) workload.
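Labeling GHC's training data with the four shades instead of two classes might be sketched as follows; the source names and snapshot contents are placeholders, not the paper's actual data:

```python
# Label training snapshots for the graduated-hybrid classifier (GHC) with
# four workload shades rather than two classes. Snapshot identifiers here
# are hypothetical placeholders.

SHADES = {
    "tpc_h":    "Heavy DSS",
    "browsing": "Light DSS",
    "tpc_c":    "Heavy OLTP",
    "ordering": "Light OLTP",
}

def build_training_set(sources: dict) -> list:
    """Attach each source's shade label to every one of its snapshots."""
    return [(snap, SHADES[name]) for name, snaps in sources.items() for snap in snaps]

sources = {"tpc_h": ["h1", "h2"], "browsing": ["b1"],
           "tpc_c": ["c1"], "ordering": ["o1", "o2"]}
training = build_training_set(sources)
print(training)
```

Scoring against such a training set then yields a concentration per shade, which is how GHC reports the composition of a mixed workload.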
Fig. 2 Different shades of DSS and OLTP
In other words, GHC attempts to qualitatively analyze the different aspects of the DSS and OLTP elements in the workload by reporting the concentration of each workload shade comprising each type. Besides having practical advantages, GHC demonstrates that our approach can be applied to workloads of more than two types of work. We validate our approach experimentally with workloads generated from Transaction Processing Performance Council (TPC) benchmarks¹ and with real workloads provided by three major global banking firms. These workloads are run on DB2® Universal Database™ Version 7.2 (IBM Corp., 2000). The rest of this paper is organized as follows. Section 2 reviews related work. Section 3 first describes our approach to the problem and the selection criteria for the snapshot attributes. It then explains our methodology and how to compose the snapshots. Section 4 describes our different sets of experiments with the four classifiers, and discusses the results obtained from experimenting with the benchmark and industry-supplied workloads. Section 5 presents our conclusions and guidelines for future work.
2 Related work
Our research, as we indicated earlier, contributes to the development of autonomic DBMSs, which are intelligent enough to know themselves and the surrounding environment in order to regulate their operations. DBMS vendors have accepted the concept of an autonomic DBMS as a viable approach to simplifying the task of managing their products and now include self-tuning features in several areas of their products. In query optimization, for example, some self-tuning features include automatic collection of new statistics and the construction of histograms (Chaudhuri & Narasayya, 2000; Oracle Corp., 2002), self-validation of the cardinality model (Stillger, Lohman, Markl, & Kandil, 2001) and dynamic adjustment of the query execution strategy. Examples of self-tuning features for configuration management include wizards to select indexes (Lohman, Valentin, Zilio, Zuliani, & Skelly, 2000; Microsoft Corp., 2002; Oracle Corp., 2001a) and materialized views (Agrawal, Chaudhuri, & Narasayya, 2001; Oracle Corp., 2001c). Examples of self-tuning features for maintenance and monitoring include tools, such as DB2 UDB's Health Center (IBM Corp., 2003) and Oracle's Manager Console (Oracle Corp., 2001a), which indicate potential performance problems and facilitate the collection and analysis of performance data, and tools that schedule necessary maintenance utilities. To the best of our knowledge, there is no previous published work examining the problem of automatically identifying the type of a DBMS workload. There are, however, numerous studies characterizing database workloads based on different properties that can be exploited in tuning DBMSs (Elnaffar & Martin, 2002).
Several studies use clustering to obtain classes of transactions grouped according to their consumption of system resources or according to the reference patterns in order to tune the system (Yu & Dan, 1994) or to

¹ Note that since our TPC benchmark setups have not been audited per TPC specifications, our benchmark workloads should only be referred to as TPC-like workloads. When the terms TPC-C, TPC-H, and TPC-W are used to refer to our benchmark workloads, it should be taken to mean TPC-C-, TPC-H-, and TPC-W-like, respectively.
balance the workload (Nikolaou et al., 1998). Some studies focus on characterizing the database access patterns to predict the buffer hit ratio (Dan, Yu, & Chung, 1995) and the user access behavior (Sapia, 2000). Other studies characterize DBMS workloads on different computer architectures in order to diagnose performance degradation problems (Ailamaki, DeWitt, Hill, & Wood, 1999) and to characterize the memory system behavior of the OLTP and DSS workloads (Barroso, Gharachorloo, & Bugnion, 1998). Hsu, Smith, and Young (2001) systematically analyze the workload characteristics of TPC-C™ (TPC, 2001a) and TPC-D™ (TPC, 1999a) workloads, especially in relation to those of real production database workloads. This study shows that the production workloads exhibit a wide range of behavior, and in general, the two benchmarks complement each other in reflecting the characteristics of the production workloads.
3 Approach

We view the problem of classifying DBMS workloads as a machine-learning problem in which the DBMS must learn how to recognize the type of the workload mix. The workload itself contains valuable information about its characteristics that can be extracted and analyzed using data mining tools. Our approach is therefore to use data mining classification techniques, specifically Decision Tree Induction (Murthy, 1998), to build a classification model. One of the advantages of using decision tree induction is its high interpretability, that is, the ease of extracting the classification rules and the ability to understand and justify the results, in comparison with other techniques such as neural networks.

3.1 Overview

Classification is a two-step process. In the first step, we build a model (or classifier) to describe a predetermined set of data classes. The model is constructed by analyzing a training set of data objects. Each object is described by attributes, including a class label attribute that identifies the class of the object. The learned model is represented in the form of a decision tree embodying the rules that can be used to categorize future data objects. In the second step, we use the model for classification. First, the predictive accuracy of the classifier is estimated using a test data set. If the accuracy is considered acceptable (for example, reporting that 80%, or more, of the tested snapshots are classified as DSS or OLTP when we attempt to identify a DSS- or OLTP-deemed workload), the model can be used to classify other sets of data objects for which the class label is unknown. For our problem, we define the DSS and OLTP workload types to be the two predefined data class labels. The data objects used to build the classifier are performance snapshots taken during the execution of a training database workload.
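The two-step process, building the model and then vetting it against a test set before use, can be sketched as follows; the 1-nearest-neighbour model is a trivial stand-in for decision-tree induction, and all data values are invented:

```python
# Step 1: build a classifier from labeled training snapshots.
# Step 2: estimate predictive accuracy on a held-out test set and accept the
# model only if it clears a threshold (80% in the paper's example).
# A 1-nearest-neighbour rule on one numeric feature stands in for the
# decision-tree induction actually used.

def train(training_set):
    """'Training' here just memorizes the labeled (feature, label) pairs."""
    return list(training_set)

def predict(model, x):
    # Label of the nearest training example.
    return min(model, key=lambda pair: abs(pair[0] - x))[1]

def accuracy(model, test_set):
    hits = sum(predict(model, x) == y for x, y in test_set)
    return hits / len(test_set)

train_set = [(0.95, "DSS"), (0.98, "DSS"), (0.40, "OLTP"), (0.35, "OLTP")]
test_set  = [(0.90, "DSS"), (0.30, "OLTP"), (0.97, "DSS"), (0.45, "OLTP")]

model = train(train_set)
acc = accuracy(model, test_set)
print(acc, "accepted" if acc >= 0.80 else "rejected")
```

Only a model that passes the second step is used to classify snapshots whose class label is unknown.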
Each snapshot reflects the workload behavior (or characteristics) at some time during the execution and is labeled as being either OLTP or DSS. We tried building our classifiers using two methods. We used SPRINT (Shafer, Agrawal, & Mehta, 1996), a fast scalable decision-tree based algorithm, and neural network (NN) classification using a feed-forward network architecture and a back-propagation learning algorithm. Both methods are implemented in IBM DB2® Intelligent Miner™ Version 6.1 (IBM Corp., 1999), which we used to design and configure our classifiers. We found that the decision tree classification method is better for our problem than the neural network method for several reasons. First, the decision tree method, as expected, is easier to use and to set up than the neural networks method. Second, it is easier to interpret and explain the results from the decision tree method. Third, the decision tree method
provides the ability to assign weights to the attributes that reflect the importance of the attributes to the decision process. Finally, the decision tree method achieved a higher accuracy in tests than the neural network algorithm. Table 3 describes the settings we used in the decision tree algorithm that we adopted in implementing our methodology.

3.2 Snapshot attributes

The data objects needed to build the classifier are performance snapshots taken during the execution of the database workload. Each snapshot reflects the workload behavior at some time during the execution. We use the following criteria to select snapshot attributes:

1. Relevance. Select attributes that play a role in distinguishing between DSS and OLTP mixes.
2. Accessibility from the System. Select attributes that are readily and inexpensively obtainable from the DBMS or the operating system at run time.
3. Low System-Dependence. Select attributes that are less sensitive to changes in the system settings or to DBMS configuration parameter changes. System settings include operating system resource allocations, such as memory and CPUs, and the database schema. DBMS configuration parameters include buffer pool sizes, sort heap size, isolation level, and the number of locks.

In classification studies, relevant attributes are usually proposed by domain experts. Based on our academic and industrial experience in the field, we first considered the following list as relevant attributes for the workload snapshots:
1. Queries Ratio (%): The ratio of SELECT statements versus Update/Insert/Delete (UID) statements, which is usually higher in DSS than OLTP.
2. Pages Read: DSS transactions usually access larger portions of the database than OLTP transactions.
3. Rows Selected: Although a DSS query tends to summarize information, it may still return more rows than an OLTP one.
4. Throughput: The number of SQL statements executed during the snapshot is expected
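Computing such attributes from raw monitor counters accumulated over a snapshot interval might look like the following sketch; the counter names are hypothetical illustrations, not actual DB2 monitor elements:

```python
# Derive snapshot attributes from raw DBMS monitor counters accumulated over
# one snapshot interval. The counter names are invented for illustration.

def snapshot_attributes(counters: dict, interval_sec: float) -> dict:
    selects = counters["select_stmts"]
    uid = (counters["update_stmts"] + counters["insert_stmts"]
           + counters["delete_stmts"])
    total = selects + uid
    return {
        "queries_ratio": selects / total,            # SELECT share vs UID statements
        "pages_read": counters["pages_read"],        # data volume touched
        "rows_selected": counters["rows_selected"],  # result-set size
        "throughput": total / interval_sec,          # SQL statements per second
    }

counters = {"select_stmts": 180, "update_stmts": 10, "insert_stmts": 5,
            "delete_stmts": 5, "pages_read": 40000, "rows_selected": 1200}
print(snapshot_attributes(counters, interval_sec=60.0))
```

Each such attribute vector, labeled DSS or OLTP during training, is one data object fed to the classifier.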
4.4 Constructing generic classifiers

Notwithstanding the success of Classifier(O, B), the above results obtained from assessing the genericness of the two classifiers lead us to believe that a classifier trained on a particular workload should not be expected to be universally good at identifying other workloads, especially if the other workloads have different characteristics. Every workload is a mix of its own set of SQL statements with their own characteristics. Nevertheless, we argue that we can construct a more generic classifier, a hybrid classifier, by training it on different flavors of DSS and OLTP mixes in order to derive more generic rules that can recognize a wider range of workloads. Such training should empower the prediction

Fig. 11 Classifier(C, H) and Classifier(O, B) identifying TPC-C and TPC-H workloads
own indexes. In Proceedings of the 16th IEEE Conference on Data Engineering. San Diego, CA.
Microsoft Corp. (2002). Microsoft SQL Server 2000 Documentation.
Murthy, S. (1998). Automatic construction of decision trees from data: A multi-disciplinary survey. Data Mining and Knowledge Discovery, 2, 345-389.
Nikolaou, C., Labrinidis, A., Bohn, V., Ferguson, D., Artavanis, M., Kloukinas, C., et al. (1998). The impact of workload clustering on transaction routing. Technical Report FORTH-ICS TR-238.
Oracle Corp. (2001a). Oracle 9i Manageability Features. An Oracle White Paper. Retrieved from http://www.oracle.com/ip/deploy/database/oracle9i/collateral/ma_bwp10.pdf.
Oracle Corp. (2001b). Oracle9i Database Performance Guide and Reference, Release 1 (9.0.1), Part# A87503-02.
Oracle Corp. (2001c). Oracle 9i Materialized Views. An Oracle White Paper. Retrieved from http://technet.oracle.com/products/oracle9i/pdf/o9i_mv.pdf.
Oracle Corp. (2002). Query Optimization in Oracle 9i. An Oracle White Paper. Retrieved from http://technet.oracle.com/products/bi/pdf/o9i_optimization_twp.pdf.
Sapia, C. (2000). PROMISE: Predicting query behavior to enable predictive caching strategies for OLAP systems. In Proceedings of the Second International Conference on Data Warehousing and Knowledge Discovery (pp. 224-233).
Shafer, J., Agrawal, R., & Mehta, M. (1996). SPRINT: A scalable parallel classifier for data mining. In Proceedings of the 22nd International Conference on Very Large Data Bases. Mumbai (Bombay), India.
Stillger, M., Lohman, G., Markl, V., & Kandil, M. (2001). LEO - DB2's LEarning optimizer. In Proceedings of the 27th VLDB Conference (pp. 19-28). Rome, Italy.
TPC (1999a). Benchmark D Standard Specification Revision 2.1, Transaction Processing Performance Council.
TPC (1999b). Benchmark H Standard Specification Revision 1.3.0, Transaction Processing Performance Council.
TPC (2001a). Benchmark C Standard Specification Revision 5.0, Transaction Processing Performance Council.
TPC (2001b). Benchmark W (Web Commerce) Standard Specification Revision 1.7, Transaction Processing Performance Council.
Yu, P., & Dan, A. (1994). Performance analysis of affinity clustering on transaction processing coupling architecture. IEEE Transactions on Knowledge and Data Engineering, 6(5), 764-786.
Trademarks The following terms are trademarks or registered trademarks of International Business Machines Corporation in the United States, other countries, or both: DB2, IBM, Intelligent Miner, Universal Database. Microsoft and Windows are registered trademarks of Microsoft Corporation in the United States, other countries, or both. Other company, product or service names may be trademarks or service marks of others.