Mulini: An Automated Staging Framework for QoS of Distributed Multi-Tier Applications Gueyoung Jung
Calton Pu
Galen Swint
CERCS, College of Computing, Georgia Institute of Technology, Klaus Advanced Computing Building, 266 Ferst Drive, Atlanta, GA 30332-0765
{gueyoung.jung, calton, galen.swint}@cc.gatech.edu
ABSTRACT
The increasing scale and success of distributed multi-tier applications have created increasingly dynamic workload variations that make system performance less predictable. Consequently, staging has become a significant and useful method to characterize the performance and Quality of Service (QoS) of such applications. Manual staging is an expensive, time-consuming, and error-prone process. In particular, manually exploring the large configuration parameter space of these applications is a cumbersome task. In this article, we outline the design of Mulini, an automated staging framework for large-scale multi-tier applications that realizes this automation via an extensible and flexible code generator. Mulini adopts XSLT/XPath tools and aspect-oriented programming (AOP) techniques to manipulate XML-encoded high-level specifications and weave non-functional specifications (e.g., QoS) into the staging implementation. To illustrate the usability of the Mulini code generator in complex staging, we apply Mulini to bottleneck detection and observation-based performance characterization of the RUBiS eCommerce benchmark.
Categories and Subject Descriptors
D.2.11 [Software Engineering]: Software Architectures – languages (e.g., description, interconnection, definition), domain-specific architectures
General Terms Performance, Languages.
Keywords Mulini, staging, code generation, multi-tier application, QoS.
1. INTRODUCTION
In the lifecycle of enterprise-scale applications, staging (i.e., pre-production testing and verification of applications) has become an important phase. Staging can test Quality of Service (QoS) to a certain level of assurance, particularly those QoS parameters associated with sophisticated non-functional specifications (e.g., security, reliability, availability, and performance).

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. ASE Workshop on Automating Service Quality, November 2007, Atlanta, Georgia, USA. © 2007 ACM ISBN: 978-1-59593-878-7/07/11...$5.00.

As the complexity of applications increases, carrying out staging early and frequently has become an increasingly expensive and difficult task when done manually. For instance, these applications typically consist of multiple tiers, where each tier embodies various hardware, software servers, and management tools with a number of configuration parameters. Typically, these applications run over heterogeneous hardware and operating systems; thus, the configuration parameter space can grow very large as the number of tiers increases. In addition, service providers want to use staging to evaluate system responses to dynamic environmental changes such as workload variations (from normal workload to flash crowds) and software/hardware upgrades.

In addition to the challenges of heterogeneous platforms and large configuration spaces to explore, mapping high-level specifications into implementable code is also a significant challenge in staging. Resource and requirement specifications typically come from various domains with their own syntaxes, and some of them are regularly updated during the lifecycle. Consequently, increasing application complexity limits the effectiveness of manual staging and creates higher risks in the deployment of large-scale, mission-critical distributed applications. For these reasons, the staging process must be automated, extensible, and flexible to handle dynamic environments effectively.

Our fundamental goal is to make staging worthwhile for guaranteeing QoS by lowering the barrier to early and frequent testing of applications. In our current project, we focus on three key ways to add value to the staging of large distributed multi-tier applications during development:
• Performance verification – an application, especially a web service, may be bound by performance goals motivated by internal policies or external policies agreed upon with a customer.
Realizing when the application performance will fail to meet QoS goals (i.e., when the application enters a region where it does not meet performance goals) is also valuable information. As with functional verification, this type of testing may require multiple testing cycles, possibly over multiple deployment configurations (e.g., a variable number of replications and appropriate resource allocations for application tier servers).
• Deployment workflow verification – application deployment is a non-trivial problem, as described in [10][13]. Staging provides early verification of whether an application's deployment workflow is correct.
• Historical data generation – application administration and resource assignment can benefit from historical data of resource usage [11]. When an application is new or substantially altered, staging affords an opportunity to create this data as baseline information for generating policies to be applied to the production environment (on-line). As with performance verification, historical data is acquired after many iterations of staging have been performed.

To effectively pursue these key goals in dynamic environments, we developed the Mulini code generator, which automates the staging process with extensibility and flexibility. The underlying architecture of Mulini is based on the Clearwater approach, which has demonstrated extensibility, flexibility, and modularity with respect to easily implementing new features in the code generation and instrumentation process [12]. Like Clearwater, Mulini utilizes XML and XSLT to generate code, along with aspect-oriented programming (AOP) techniques [5] to weave aspects into specifications and implementations by crosscutting them with annotations. Thus, Mulini inherently supports staging with extensible and flexible mechanisms. Currently, a prototype of Mulini is actively used in the Elba project, which focuses on automated system management through bottleneck detection, performance characterization, and (re)configuration/(re)design of enterprise-scale applications and web services [4].

The remainder of this article is organized as follows: Section 2 presents some important features of Mulini, followed by its architecture and code generation process. Section 3 outlines the three ongoing projects of Elba to which Mulini is effectively applied. Section 4 introduces related work; Section 5 discusses planned future work and concludes.

2. CODE GENERATION PROCESS
2.1 Extensibility, Flexibility, Automation
Components in the Mulini architecture (illustrated in Figure 1) communicate by exchanging XML documents. XML provides several important advantages for Mulini. XML is designed with extensibility and flexibility in mind, and our own tool-building and integration approaches to code generation leverage these qualities. Furthermore, XML is straightforward to parse and is therefore amenable to many types of tools that accept text-based data. XML is also the format of choice for many of the existing tools that Mulini comprises: the policy formalization tool Cauldron, the model-driven application design tool Ptolemy II, and the deployment engine SmartFrog. To manipulate the XML tree structure, we employ XPath to locate elements (and attributes) of the XML tree or its sub-trees. We also use XSLT to control code generation by wrapping the source code being generated with XSLT control instructions (e.g., copy, branch, loop, variable declarations, and the insertion/replacement of target values from the source XML into the source code using XPath).

Figure 1. Mulini code generator architecture with four tools: the policy formalization tool (Cauldron [11]), the model-driven application design tool (Ptolemy II [16]), the deployment engine (SmartFrog [17]), and the decision tree classifier (Weka [18]).

We omit the XML+XSLT-based code generation technique here due to space limitations; for more detail on the technique, please refer to [12]. The high-level code generation process adopted for Mulini matches the compiler approach of multiple steps of code transformation. In the first transformation step, Mulini transforms the XML-encoded input specification files into an intermediate XML-encoded specification with annotations. During the remaining transformation steps, modules (aspects) are woven into the intermediate specification using those annotations. The important advantages of the Mulini code generation process over compilation stem from its extensibility and flexibility, achieved by utilizing XML and the standard XML manipulation languages (i.e., XPath, XSLT, XQuery, and XQuery Update). With these languages and tools, we can effectively decouple the code generation process itself from the integration of various input specifications and output implementations (extensibility) as well as from the evolution of specification syntaxes (flexibility). Mulini can take multiple high-level specifications as input, including UML software designs, various XML-encoded standards, and plain text, and can simultaneously generate multiple outputs, including deployment scripts, workload drivers, monitors, wrapper code for the analyzer and the application itself, and compile scripts, in various language formats (e.g., C, C++, Java, and scripts). Mulini easily integrates various off-the-shelf management tools by encapsulating them with XSLT instructions; using these instructions, we define input/output types and references to extract values from the source XML. Furthermore, Mulini can also customize application code, for instance retargeting server identifiers from production values to pre-production values, by weaving aspects from the aspect repository into specifications and implementations.

The evolution of specifications is also taken into account in our code generation architecture. This evolution occurs frequently through the application lifecycle as new features are added to application systems in response to software/hardware upgrades and as application functionality is inserted or removed. Such changes are easily reflected in Mulini with little or no change to its code generation process: custom XML tags are added to (or modified in) the specifications and then (re)mapped into the rest of the code generation process. Consequently, the extensibility and flexibility of Mulini enable us to address the dynamic-environment issues of enterprise-scale applications (i.e., heterogeneity and dynamic changes of software and hardware). Mulini also centralizes the configuration of the target application system into a single intermediate specification. This allows us to change configuration parameters easily by locating the target parameters in the single intermediate specification with a fairly straightforward XPath statement, rather than hunting for configuration parameters scattered across and throughout the application. Once a configuration parameter is changed, the related portions (XML elements and their sub-trees) of the series of intermediate specifications and target implementations are changed automatically by mapping the elements of the specification to those of the intermediate specification and implementations.

[Figure 2 (XTBL excerpt): an XML-based CIMMOF meta-model describing resources (./meta-model/TPCW-XMOF.xml), an XML-based SLA describing performance metrics (./sla/TPCW-WSML.xml), and the TPC-W benchmark shopping model (./flow-model/TPCWShopping.xml), with the staging target http://hera.cc.gatech.edu:8000/tpcw.]
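The single-point-of-configuration idea can be illustrated with a short, self-contained sketch. This is not Mulini's actual code or schema: the xtblPlus, tier, server, and param element names below are hypothetical, chosen only to show how one XPath query can locate and update a parameter in a centralized intermediate specification (here using Python's standard xml.etree.ElementTree).

```python
# Sketch: editing one configuration parameter in a single intermediate
# specification via XPath, rather than hunting through scattered files.
# All element/attribute names are hypothetical, not Mulini's schema.
import xml.etree.ElementTree as ET

INTERMEDIATE_SPEC = """
<xtblPlus>
  <tier name="web">
    <server id="web-1">
      <param name="maxClients">150</param>
    </server>
  </tier>
  <tier name="db">
    <server id="db-1">
      <param name="maxConnections">100</param>
    </server>
  </tier>
</xtblPlus>
"""

def set_param(root, server_id, param_name, value):
    """Locate one configuration parameter with a single XPath query."""
    xpath = f".//server[@id='{server_id}']/param[@name='{param_name}']"
    node = root.find(xpath)
    if node is None:
        raise KeyError(f"no parameter {param_name!r} on server {server_id!r}")
    node.text = str(value)
    return node

root = ET.fromstring(INTERMEDIATE_SPEC)
set_param(root, "web-1", "maxClients", 300)
print(root.find(".//server[@id='web-1']/param").text)  # -> 300
```

In Mulini, the corresponding elements of downstream specifications and generated implementations would then be updated through the element mappings described above.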
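In the same spirit, the AOP-style weaving of aspects into an XML-encoded specification can be sketched as follows. This is an illustration, not Mulini's implementation: the monitor annotation and the aspect element are invented for the example, while the activity identifiers are borrowed from the XTBL+ excerpt.

```python
# Sketch of AOP-style weaving over an XML specification: elements carrying
# a hypothetical monitor="true" annotation act as join points, and a
# monitoring aspect is woven in as a child element.
import xml.etree.ElementTree as ET

XTBL = """
<xtbl>
  <activity id="WorkloadGenerator-1_Install" monitor="true"/>
  <activity id="WorkloadGenerator-1_Configuration"/>
  <activity id="WorkloadGenerator-1_Ignition" monitor="true"/>
</xtbl>
"""

def weave_monitoring(root):
    """Crosscut the specification: attach a monitor aspect to every
    activity annotated as a join point."""
    for activity in root.iter("activity"):
        if activity.get("monitor") == "true":
            aspect = ET.SubElement(activity, "aspect")
            aspect.set("type", "resource-monitor")
            aspect.set("target", activity.get("id"))
    return root

woven = weave_monitoring(ET.fromstring(XTBL))
print(ET.tostring(woven, encoding="unicode"))
```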
The specification weaver, the first component of Mulini, takes as input a domain-specific test bed language (XTBL), which captures staging-specific information including the locations of the aspects to be woven, a list of input specifications, the output implementations, and their mapping and location information. From these high-level specifications, the specification weaver incorporates the necessary information into an intermediate XML document, referred to as XTBL+, as illustrated in Figure 2.

[Figure 2 (XTBL+ excerpt): a host description (awing14, 2.8 GHz) and the WorkloadGenerator install, configuration, and ignition activities, with a Finish-Start dependency between the install and configuration activities.]

ACCT (Automated Composable Code Translator), the second component of Mulini, extracts the deployment information from XTBL+ as a list of pair-wise dependencies between deployment activities, for instance, (ws_installation, ws_configuration) and (ws_configuration, dbs_configuration), each annotated with FS or SS, where FS
and SS notations mean "start the second activity after finishing the first activity" and "start both activities at the same time," respectively. Using this list, ACCT composes a global deployment dependency (for additional detail, see [10]) and finally transforms the XTBL+ into scripts that serve as input to the deployment engine (SmartFrog). Meanwhile, the code generator, the third component of Mulini, generates baseline staging code such as the synthetic workload generator, monitors, the SLO-evaluator, and the input/output configurations of the tools used during staging. It first extracts the necessary information from XTBL+ and then executes the proper XSLTs against XTBL+. The code generated at this step is encapsulated within custom XML tags for reference by additional XSLT statements during the rest of the code generation process. For an example of custom XML tags,