This paper is a preprint of a paper accepted by IEE Proc. Software and is subject to Institution of Engineering and Technology Copyright. When the final version is published, the copy of record will be available at IET Digital Library
NON-INTRUSIVE END TO END RUN-TIME PATH TRACING FOR J2EE SYSTEMS Trevor Parsons Performance Engineering Laboratory University College Dublin Dublin 4, Ireland Email:
[email protected]
Adrian Mos INRIA Rhone Alpes Montbonnot 38334, France Email:
[email protected]
John Murphy Performance Engineering Laboratory University College Dublin Dublin 4, Ireland Email:
[email protected]
Abstract We introduce an end to end run-time path tracing approach for J2EE systems. The approach is non-intrusive and thus does not require instrumentation of middleware or application source code. An implementation of our system has been realised in the COMPAS Java End to End Monitoring tool which extends and integrates a number of open source projects. Our results give details on the performance overhead associated with our tool. We present further results that show the portability of our approach by applying it to a number of different application server implementations. Finally we also show that run-time paths collected by our implementation can be used to easily reason about the overall system structure and design of complex enterprise applications.
1. Introduction
Enterprise development is moving away from large monolithic systems and simple client/server configurations to highly interconnected, multi-tier, distributed architectures designed to run on a heterogeneous collection of servers. Typical J2EE systems, for example, consist of a four tier architecture [1]. Each tier can itself be made up of a number of different software components that interact to perform end-user transactions. Figure 1 shows a typical J2EE architecture which includes a web, application and database server. In such environments each of the different servers that make up the system generally produces multiple logs used for system monitoring. While this information can be useful for assessing the performance of individual servers, it can prove difficult to piece the different logs together to form a coherent picture of the entire application. Debugging performance problems in such circumstances can be problematic, as developers are required to sift through and correlate a range of different log files in an attempt to understand exactly how the different components in the system interact. As a result, developers often do not have a good understanding of the entire application. This lack of understanding can lead to badly designed systems that perform poorly, and which in turn can also be difficult to debug and maintain.
A run-time path [2] contains the control flow (i.e. the ordered sequence of method calls required to service a user request), resources and performance characteristics associated with servicing a request. By analysing these paths one can easily see how system resources are being used, how the different components in the system interact and how user requests traverse through the different tiers that make up the system.
Run-time path tracing can be used to collect run-time paths which can be analysed to help developers and performance engineers understand the overall system structure of complex distributed applications [2]. More advanced path analysis can also be performed to unlock further information that can help developers to understand their systems. For example, data mining algorithms have been used to analyse run-time path data for the purpose of detecting software antipatterns [3]. Chen et al. [4] have also analysed run-time paths using data clustering and statistical techniques for problem determination.
There are currently a number of end-to-end monitoring tools that have the ability to trace system-wide paths through a system [4] [5] [6]. However, a major drawback of the current tools is that they are intrusive, i.e. they require changes to the application source code or the server implementation. In many situations, however, it may not be possible to make such changes, especially in situations where system components are from multiple vendors or located in places where path monitoring code cannot be inserted because of security, licensing or other technical constraints.
In this paper we introduce a non-intrusive end-to-end run-time path tracing tool, COMPAS Java End to End Monitoring (JEEM), that monitors the paths user requests take when traversing through the
different tiers in the system. Resource usage and performance characteristics of each request can also be obtained using our approach. It builds upon the open source COMPAS monitoring framework [7] [8] [9] and leverages and extends the capabilities of a number of other open source projects [10] [11]. The remainder of the paper is structured as follows: Section 2 gives our motivation and the considerations that should be undertaken (independent of the underlying technology) for non-intrusive run-time path tracing. Section 3 gives an overview of J2EE, the component framework that our run-time path tracing prototype has been implemented for. Since our approach is non-intrusive it leverages the actual J2EE technology itself, monitoring J2EE systems using standard middleware mechanisms. This section introduces a number of the standard middleware mechanisms that our approach takes advantage of to achieve its aims. Section 4 presents the core instrumentation process and the extension capabilities available in the COMPAS monitoring framework. The end-to-end run-time path tracing approach leverages these extension mechanisms to build upon the COMPAS monitoring and instrumentation infrastructure. In section 5 we describe the extensions made to the COMPAS framework and the overall architecture of COMPAS JEEM. In this section we also detail the open source projects that we have extended and integrated with COMPAS to achieve our aims. Section 6 gives the results we obtained when we applied our tool to a number of J2EE applications. Here we show how run-time paths can be used to help developers reason about their overall system design. We also show that our implementation is portable and that we successfully applied it to a number of different application server implementations. Finally in this section information is given on the performance overhead incurred by applying COMPAS JEEM to a J2EE application. 
Sections 7 and 8 give our plans for future work and conclusions respectively.
2. Motivation and Considerations for Non-Intrusive Run-Time Path Tracing
2.1. Motivation for Run-Time Path Tracing
A run-time path shows how a user request is serviced, giving the ordered sequence of events/calls and related performance and resource usage information. Run-time path analysis can help greatly with system comprehension and can be applied for a number of different purposes in this area. In the following paragraphs we give the motivation for our research by detailing some of the different ways in which run-time paths can be applied.
System comprehension can be a major issue for developers of large enterprise applications. A lack of system-wide understanding can lead to badly designed systems which perform poorly and can be difficult to debug and maintain. In section 6.3 we show how run-time paths can be used to quickly deduce the overall system structure of enterprise applications. Analysing run-time paths to deduce system structure is advantageous for a number of reasons. Firstly we found that this approach was a lot faster
than analysing the source code to deduce a high level understanding of the system structure. Analysing application source code can be cumbersome since a large number of files can be involved. Even when using code comprehension tools (e.g. [12]) determining the chain and order of method calls by stepping through source code can be a complex and time consuming task. In addition, some paths might be impossible to determine from static analysis, in particular in systems which contain workflow engines that select execution targets based on runtime conditions. COMPAS JEEM presents all calls from the main software components that make up a user request in a single run-time path. It maintains the order of calls and the call sequence can be easily observed by traversing the run-time path tree-like structure (see figure 5). Another advantage of using run-time paths, for system comprehension, is that they can be well represented in a diagrammatic format. The diagram construction process can be easily automated reducing the effort required on the part of the developer further [13]. As shown in section 6.3, such diagrams are useful for identifying performance design issues or antipatterns [14] that might exist in enterprise applications. Chen et al. [4] have applied run-time paths to deal with fault detection and diagnosis in large distributed systems. To detect faults they characterize distributions for normal paths and look for statistically significant deviations that might suggest failures. To isolate faults to the components responsible (diagnosis) they search for correlations (using data mining and machine learning techniques) in the run-time paths between component use and failed requests. They have also applied run-time path analysis to assess the impact of faults and to help with system evolution. Traditional approaches to this problem have relied on static dependency models. However such dependency models often do not capture the dynamic nature of today’s systems. 
Chen et al. use dynamic tracing (i.e. run-time paths) to capture the dynamic nature of today’s constantly evolving internet systems.
Automatic design analysis is another area in which run-time paths have been applied to help gain a better understanding of enterprise systems. Recent work in this area [15] [16] details how advanced analysis techniques can be applied to run-time paths to automatically identify instances of inefficient design (i.e. performance design antipatterns). Data mining algorithms can be applied to run-time paths to find patterns of interest and to identify relationships between the different software components that suggest inefficient design choices have been made. For example in [3] association rule mining is applied to dynamic event traces to highlight the occurrence of inefficient design in a J2EE system which leads to increased network traffic.
The literature [17] [18] presents the use of run-time paths to address performance issues. The Magpie project [17] collects run-time paths for distributed systems. It also measures resource consumption. Magpie then builds stochastic workload models suitable for performance prediction, tuning and diagnosis.
Larus [18] has used Whole Program Paths to find hot subpaths, which are heavily executed sequences of code that should be the focus of performance tuning and optimization.
Finally, another application of run-time paths (outside the area of system comprehension) can be seen in recent work on autonomic monitoring [19]. Here Mos and Murphy introduce an adaptive monitoring framework whereby instrumentation is performed by low-overhead monitoring probes which are automatically activated and deactivated based on run-time conditions. The approach makes use of dynamic models to activate and deactivate monitoring probes, such that the monitoring overhead is kept to a minimum under normal conditions. If there is a sudden degradation in system performance, monitoring probes are automatically activated to identify the bottleneck component(s). The dynamic models are utilised to determine which monitoring probes should be activated to identify the bottleneck. The dynamic models consist of the monitored components and the dynamic, ordered relationships between them (i.e. the dynamic models are equivalent to run-time paths).
2.2. Considerations for Run-Time Path Tracing
Today’s enterprise applications are generally required to handle high loads of concurrent users. Run-time path tracing in such systems involves tracing each of these user requests as they pass through the different software tiers that make up the entire application. To achieve this, user requests must be identifiable across the entire request. The order of the calls that make up the requests also needs to be maintained. A system needs to meet the following requirements to be able to perform run-time path tracing:
R1: the system must be able to identify new user requests entering the system
R2: the system is required to tag the request with request specific information (RSI) so that calls can be mapped to the originating requests
R3: the system requires the ability to piggy back the RSI with the request, such that the request can be tracked across the entire system
R4: a monitoring framework that provides for request interception is required
Next we detail how run-time path tracing can be achieved through the above requirements. The user request identification mechanism (R1) is required to determine when a new request enters the system such that it can be (R2) tagged with request specific information (RSI). The RSI contains a unique id so that the request can be identified and distinguished from other concurrent requests in the system. The RSI also contains a sequence number that is used to order the calls that make up the request. The sequence number is incremented (by the system’s run-time path ordering logic) every time a component method is called and is reset upon a new user request.
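To make the role of the RSI concrete, the following is a minimal Java sketch of a data structure holding a unique request id and a per-request sequence number. The class and method names (RequestSpecificInfo, newRequest, nextSequence) are hypothetical and chosen for illustration only; they are not taken from the COMPAS JEEM implementation, and the use of a UUID as the unique id is an assumption.

```java
import java.util.UUID;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical RSI holder; names and id format are illustrative assumptions. */
final class RequestSpecificInfo {
    // Unique id distinguishing this request from other concurrent requests.
    private final String requestId;
    // Sequence number used to order the calls that make up the request.
    private final AtomicInteger sequence = new AtomicInteger(0);

    private RequestSpecificInfo(String requestId) {
        this.requestId = requestId;
    }

    /** Called when a new user request enters the system (R1, R2); the sequence starts at zero. */
    static RequestSpecificInfo newRequest() {
        return new RequestSpecificInfo(UUID.randomUUID().toString());
    }

    String requestId() {
        return requestId;
    }

    /** Incremented every time a component method is called on behalf of this request. */
    int nextSequence() {
        return sequence.incrementAndGet();
    }
}
```

Creating a fresh object per request is one simple way to obtain the "reset upon a new user request" behaviour described above; how the object travels with the request (R3) is a separate concern.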
User requests also need to be tracked (R3), so that, as the request passes through the different software components that make up the system, it can be identified and the order of its calls maintained. In order to be able to track the request the RSI must be attached to it, in such a way as to be available at every call point along the request. This can be particularly challenging when a system is distributed across a network. A monitoring framework (R4) is required to intercept and log calls made to the software components that make up the request. The RSI, along with component details are logged upon a method invocation. The component details describe the component that is called, giving information such as what component method was invoked, what arguments were passed etc. The logged information can be subsequently used to construct the run-time paths offline. The reconstruction process makes use of the RSI data to (a) determine what calls make up each run-time path and (b) to order the calls. Thus far, the above requirements have generally been met in a manner that involves instrumenting either the software application itself or the underlying middleware [6] [11]. Manual instrumentation of the application source is undesirable since instrumenting the code can be time consuming and cumbersome and the effort must be repeated for each new system under development. Since the J2EE specification does not explicitly address the requirements for run-time path tracing, middleware layer approaches for J2EE have involved using non-standard mechanisms provided by specific vendors that are not portable across the technology and that often require the source code of the server to be available (such that it can be recompiled). For example the JBoss group recognised the need for R4 by providing its interceptor-based component architecture [20] which allows for the interception of calls to business tier J2EE components. 
IBM has also recently addressed the need for R3 through its Work Area Service [21]. The biggest issue with using non-standard mechanisms, however, is that developers get locked in to using a particular middleware implementation and lose the flexibility of being able to quickly change their underlying middleware implementation should the need arise. Consequently, such approaches cannot be applied to systems that are made up of application servers from numerous different vendors.
In the following sections we present an approach that, through the COMPAS monitoring framework, takes advantage of a number of J2EE standard mechanisms to meet the requirements outlined above for J2EE systems in a non-intrusive and portable manner. While our implementation focuses on the J2EE technology, only the framework instrumentation process and request tracking are J2EE specific; the remainder of the work presented below can be applied to other component technologies (e.g. CCM, .NET). This is discussed in detail in section 5.4.
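The offline reconstruction step described in section 2.2, which uses the RSI data to (a) determine what calls make up each run-time path and (b) order the calls, can be sketched as follows. This is an illustrative Java sketch under stated assumptions: the CallRecord and PathReconstructor types are hypothetical, a logged call is reduced to a request id, a sequence number and a method name, and the result is a flat ordered list per request rather than the tree-like structure the tool presents.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical log record: the RSI (request id, sequence number) plus the invoked method. */
final class CallRecord {
    final String requestId;
    final int sequence;
    final String method;

    CallRecord(String requestId, int sequence, String method) {
        this.requestId = requestId;
        this.sequence = sequence;
        this.method = method;
    }
}

final class PathReconstructor {
    /** Groups logged calls by request id, then orders each group by sequence number. */
    static Map<String, List<String>> reconstruct(List<CallRecord> log) {
        // (a) determine which calls belong to which request
        Map<String, List<CallRecord>> byRequest = new LinkedHashMap<>();
        for (CallRecord record : log) {
            byRequest.computeIfAbsent(record.requestId, k -> new ArrayList<>()).add(record);
        }
        // (b) order the calls of each request by their sequence number
        Map<String, List<String>> paths = new LinkedHashMap<>();
        for (Map.Entry<String, List<CallRecord>> entry : byRequest.entrySet()) {
            List<CallRecord> calls = entry.getValue();
            calls.sort(Comparator.comparingInt((CallRecord c) -> c.sequence));
            List<String> ordered = new ArrayList<>();
            for (CallRecord c : calls) {
                ordered.add(c.method);
            }
            paths.put(entry.getKey(), ordered);
        }
        return paths;
    }
}
```

Because grouping relies only on the request id and ordering only on the sequence number, log records from different tiers can arrive interleaved and out of order without affecting the result.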
3. J2EE Overview
The Java 2 Enterprise Edition (J2EE) [22] defines a standard for developing multi-tier enterprise applications. It provides an architectural framework on four tiers that developers can use to build their enterprise systems upon (see figure 1). Our non-intrusive end to end tracing approach relies on monitoring all server side tiers using standard J2EE mechanisms.
3.1. Web Tier
The J2EE web tier provides a run-time environment (or container) for web components. J2EE web components are either servlets or pages created using the JavaServer Pages (JSP) technology [23]. Servlets are Java programming language classes that dynamically process requests and construct responses. They allow for a combination of static and dynamic content within the web pages. JSP pages are text-based documents that execute as servlets but allow a more natural approach to creating the static content as they integrate seamlessly in HTML pages. JSPs and Servlets execute in a web container and can be accessed by clients (e.g. a web browser) over HTTP. The servlet filter technology is a standard J2EE mechanism that can be applied to components in the web tier to implement common pre and post-processing logic. It is discussed in detail in section 5.1.
3.2. Business Tier
Enterprise JavaBeans (EJBs) [24] are the business tier components and are used to handle business logic. Business logic is logic that solves or meets the needs of a particular business domain such as banking, retail, or finance. EJBs run in an EJB container and often interact with a database in the EIS tier in order to process requests. Clients of the EJBs can be either web components or stand alone applications. EJB is the core of the Java 2 Enterprise Edition (J2EE) platform and provides a number of complex services such as messaging, security, transactionality and persistence.
These services are provided by the EJB container to any EJB component that requests them in the associated XML deployment descriptor. XML deployment descriptors contain metadata that associates both structural and behavioural information to a particular component. An XML deployment descriptor must be associated with each component in the business tier as mandated in the EJB specification [24]. This information can be utilized to add monitoring and functionality to business tier components (see section 4.1).
3.3. Enterprise Information System Tier
Enterprise information systems provide the information infrastructure critical to the business processes of an enterprise. Examples of EISs include relational databases, enterprise resource planning (ERP) systems, mainframe transaction processing systems, and legacy database systems. The J2EE Connector architecture [25] defines a standard architecture for connecting the J2EE platform to heterogeneous EIS systems. For example a JDBC Connector is a J2EE Connector Architecture compliant connector that facilitates integration of databases with J2EE application servers. Java Database Connectivity (JDBC) [26] is an API and specification to which application developers and database driver vendors must adhere. Relational Database Management Systems (RDBMS) vendors or third party vendors develop drivers which adhere to the JDBC specification. Application developers make use of such drivers to communicate with the vendors’ databases using the JDBC API. The main advantage of JDBC is that it allows for portability and avoids vendor lock-in. Since all drivers must adhere to the same specification, application developers can replace the driver that they are using with another one without having to rewrite their application.
4. COMPAS Monitoring Framework
COMPAS [9] is intended as a foundation for building enterprise-level performance management solutions for component-based applications. It was designed to provide a common monitoring platform across different application server implementations. This is achieved by leveraging the underlying properties of component-based platforms in order to enable non-intrusive instrumentation of enterprise applications. Two main techniques are used to minimise the overhead of the monitoring infrastructure: asynchronous communication and adaptive activation. The former is employed in the entire infrastructure by the use of an event-based architecture with robust message handling entities that prevent the occurrence of locks in the target application. The latter technique uses execution models captured from the target application to drive the activation and deactivation of the monitoring probes.
By appropriately minimising the number of active probes (i.e. through the alert management and adaptive monitoring process), the total overhead is reduced while preserving complete target coverage (see [19] [9]). As the COMPAS infrastructure is designed to be used as a foundation for performance management tools, its design is highly extensible and based on decoupled communication mechanisms. The most important functional entity of the monitoring infrastructure is the monitoring probe. The probe is conceptually a proxy element with a 1 to 1 relationship with its target component. In J2EE, target components are JSPs, Servlets, EJBs and EIS components deployed in a target application. The probe is implemented as a proxy layer surrounding the target component with the purpose of intercepting all method invocations and lifecycle events. The process of augmenting a target component with the proxy layer is referred to as probe insertion. This section focuses on the core COMPAS capabilities, initially limited to EJB components only, whereas the following sections present the extensions that
allow COMPAS to instrument and monitor the remaining J2EE target components, grouped under the name COMPAS JEEM.
4.1. Portable EJB Instrumentation
COMPAS uses component meta-data to derive the internal structure of the target entities. The component metadata is placed in deployment descriptors that contain structural as well as behavioral information about the encompassing EJBs. By leveraging this data, it is possible to obtain the internal class-structure of each component, which is needed for instrumentation. The instrumentation is performed by a "proxy layer" attached to the target application through a process called COMPAS Probe Insertion (CPI). As all the information needed for probe insertion is obtained from the meta-data, there is no need for source code or proprietary application server hooks. Therefore, the effect on the target environment is minimal and user intervention in the probe insertion process is not required. COMPAS is in this respect non-intrusive, as it does not require changes to the application code or to the runtime environment as the majority of other approaches do.
The CPI process examines the Target Application’s structure and uses component metadata to generate the proxy layer. For each component in the target application, a monitoring probe is specifically generated based on the component’s functional and structural properties. The CPI process leverages deployment properties of contextual composition component-frameworks [27] to discover and analyse target applications. Therefore, CPI is conceptually portable across component frameworks such as EJB, .NET or CCM. Indeed, given a component system written for any such platform, COMPAS would be able, with minimal modifications, to insert monitoring probes that match each of the application components. For components written for the EJB platform, the following metadata is extracted and used to generate a monitoring probe for a target component (TC):
• Component Name (bean name, for EJB)
• Component Interface (Java interface implementing the services exposed to clients, for EJB)
• Locality (local or remote, for EJB)
• Component Type (stateless session, stateful session, entity or message-driven, for EJB)
• Component Interface Methods (Java methods in the business interface, for EJB)
• Component Creation Methods (ejbCreate(...) methods, for EJB)
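The proxy-layer idea behind the probe (a thin layer of indirection that intercepts every method invocation on the component interface and then delegates to the target component) can be illustrated with a short Java sketch. Note that this is a swapped-in technique used purely for illustration: COMPAS generates probe classes from a template and component metadata, whereas the sketch below uses the standard java.lang.reflect.Proxy mechanism. The AccountService interface and ProbeHandler class are hypothetical.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical business interface standing in for a component interface. */
interface AccountService {
    int balance(String accountId);
}

/** Illustrative probe: records each intercepted method, then delegates to the target. */
final class ProbeHandler implements InvocationHandler {
    private final Object target;
    final List<String> events = new ArrayList<>(); // names of intercepted methods
    long lastElapsedNanos;                          // duration of the most recent call

    ProbeHandler(Object target) {
        this.target = target;
    }

    @Override
    public Object invoke(Object proxy, Method method, Object[] args) throws Throwable {
        long start = System.nanoTime();
        try {
            return method.invoke(target, args);
        } catch (InvocationTargetException e) {
            throw e.getCause(); // rethrow the target component's own exception
        } finally {
            lastElapsedNanos = System.nanoTime() - start;
            events.add(method.getName());
        }
    }

    /** Wraps a target component in a proxy that exposes the same interface. */
    @SuppressWarnings("unchecked")
    static <T> T wrap(Class<T> componentInterface, T target, ProbeHandler handler) {
        return (T) Proxy.newProxyInstance(
                componentInterface.getClassLoader(),
                new Class<?>[] { componentInterface },
                handler);
    }
}
```

A client that holds the wrapped reference calls the component exactly as before; only the handler observes that the call passed through the probe.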
COMPAS provides a Probe Code Template (PCT), modifiable by the user, which represents a template for the generated monitoring probes. It consists of extensible logic for initiating event-handling operations and placeholders for component-specific information. Using the PCT and the extracted metadata, the CPI process generates one probe for each Target Application Component. The placeholders in the template
are replaced with the values extracted from the metadata. The proxy layer is an instantiation of the PCT, using the TC metadata values. The proxy layer (probe) is a thin layer of indirection directly attached to the TC. To fulfill its instrumentation functionality, the Probe employs the Instrumentation Layer that has the capability of processing the data captured by the Probe and performing such operations as event notifications. The Instrumentation Layer uses the COMPAS Probe Libraries for implementing most of its logic. A Modified Component (MC) results after the CPI process has been applied to a TC, and this will enclose the original TC. In addition, it will contain the Probe and Instrumentation Layer artifacts. In order to ensure a seamless transition from the TC to the MC, the CPI transfers the TC metadata to the MC. The MC metadata will only be updated so as to ensure the proper functionality of the proxy layer (the bean class property must be updated to indicate the insertion of the Probe class). The CPI process is illustrated in figure 2.
At run-time, probes collect and analyse performance and lifecycle events from the target components and communicate with the central monitoring authority, the COMPAS Monitoring Dispatcher. The Monitoring Dispatcher is the client-side entity responsible for mediating client access to the COMPAS probes by providing an abstraction layer over the lower-level communication and control operations. It contains handlers for efficient processing and transformation of probe notifications into COMPAS events. In addition, the Monitoring Dispatcher provides a control interface that allows transparent relaying of commands to the monitoring probes. The central role of the Monitoring Dispatcher is client-side data processing as illustrated in figure 3.
4.2. COMPAS Extension Points
COMPAS Monitoring contains an instrumentation core and a set of extensions for coordinating and handling instrumentation events.
The extensions are built upon the pluggable architecture of the instrumentation core by leveraging the COMPAS Framework Extension Points (FEPs) based on loosely coupled asynchronous communication. Possible extensions that can be added to COMPAS include support for low-level instrumentation sources such as virtual machine profiling data, as well as high-level functional extensions such as elaborate data processing capabilities for complex analysis of the monitoring data. Complex decision policies for improving the alert management and adaptive monitoring process can also be implemented as extensions. Since the COMPAS infrastructure spans all the J2EE server-side tiers and the client tier, it provides two types of framework extension points:
• Server-Side FEP: facilitates the extension of the functionality available to or provided by a COMPAS Monitoring Probe. In addition, it facilitates the creation of new probe types. The majority of extensions introduced by JEEM use server-side FEPs. In particular, JEEM makes use of Servlet filter probes and JDBC driver probes to feed information into the COMPAS infrastructure. Apart from these probes in the web and EIS tiers, JEEM also enhances existing EJB probes in order to generate appropriate call-path extraction information. Other examples of server-side FEPs include better time-stamp extraction techniques or advanced anomaly detection algorithms used in the COMPAS problem diagnosis processes [9].
• Client-Side FEP: facilitates the extension of the functionality available to or provided by the COMPAS Monitoring Dispatcher. The run-time path construction and analysis module introduced by JEEM uses such a client-side extension and benefits from the information made available by the distributed infrastructure to the client-side. Other examples of extensions that can be added using client-side FEPs include specialised GUI consoles or integration into wider-scope performance tools.
As indicated, the additional functionality that COMPAS JEEM provides over the basic COMPAS framework uses several FEPs, across all J2EE tiers. The following sections present these JEEM extensions in detail.
5. COMPAS Extensions
As detailed in section 4 the current implementation of the COMPAS framework has the ability to monitor the business tier of J2EE applications. We have made a number of extensions to the COMPAS framework using the COMPAS FEPs to allow for the tracing of run-time paths across all tiers of a J2EE system for multi-user workloads. The extensions made fall into three different categories, namely, monitoring, run-time path tracing and analysis. The overall architecture of COMPAS JEEM is shown in figure 3.
5.1. Monitoring
Software monitoring can be either intrusive or non-intrusive. Intrusive monitoring requires either instrumenting the source code of the application being monitored or instrumenting the middleware that the application is running on. There are a number of intrusive approaches that can trace system-wide paths for J2EE systems [4] [5] [6]. Manual instrumentation of the application source code can be time consuming and cumbersome and consequently most of the intrusive approaches for monitoring J2EE system run-time paths focus on instrumenting the middleware. For example, Chen et al. instrument the JBoss application server [28] using the JBoss interceptor concept [20]. A major drawback of this approach and intrusive approaches in general is that they are not portable across different application servers.
The Java Virtual Machine Profiler Interface (JVMPI) [29] is used by most of today’s Java profilers to collect performance metrics on running Java applications. The JVMPI (replaced by the Java Virtual Machine Tool Interface [30] in Java 1.5) allows tool developers to plug into the Java Virtual Machine and obtain information on the running system. Although the JVMPI approach is non-intrusive, since neither the source code of the application nor the source code of the middleware needs to be modified, it requires the setting of special flags within the Java Virtual Machine. The information obtained through the JVMPI is also very low-level and is unsuitable for tracing system-wide run-time paths at component level.
Agarwal et al. [31] use a data mining approach to extract resource dependencies from monitoring data. The approach is non-intrusive since the data required to extract the dependencies is obtained from existing system monitoring data. Their approach relies on the assumption that most system vendors provide a degree of built in instrumentation for monitoring. A major drawback of this approach however is that it is statistical and not exact, and at higher loads the number of false dependencies increases significantly.
To the best of our knowledge, our approach is the first non-intrusive approach that allows for the tracing of system-wide J2EE run-time paths. It works by first intercepting the calls made to the different components that make up the system and second, recording their sequence. The interception of calls is performed by extending the COMPAS framework, using server-side FEPs, so that monitoring on all three server side tiers is performed. COMPAS uses the idea of a proxy layer to monitor EJB components [19]. The proxy layer is a thin layer of indirection directly attached to the EJBs that captures any requests to and responses from the components. When extending COMPAS we applied the idea of intercepting requests to and responses from components to the web and EIS tiers. This is performed in the web tier using intercepting filters and in the EIS tier using wrapper components.
The EJB probes, intercepting filters and wrapper components can also be referred to as Interception Points (IPs). 5.1.1. Intercepting Filters: A chain of pluggable filters can be created to implement common pre- and post-processing tasks during a web page request by implementing the Intercepting Filter pattern [32]. This pattern allows for the creation of pluggable filters to process common services in a standard manner without requiring changes to core request processing code. The filters intercept incoming requests and outgoing responses, which allows for pre- and post-processing respectively. The servlet 2.3 specification [33] includes a standard mechanism for building filter chains based on the Intercepting Filter pattern. Consequently, web applications running on servlet specification compliant web servers can be augmented with filters for pre- and post-processing without the need to modify source code. Next we briefly detail the Intercepting Filter pattern, which we use to monitor the web tier of J2EE applications. A UML sequence diagram of the Intercepting Filter pattern is shown in figure 4.
The pattern contains the following participants: the Client, the FilterManager, the FilterChain, one or more Filters and finally the Target resource. The FilterManager sits between the Client and the Target resource. When the Client makes a request for the Target resource, the FilterManager intercepts this request and creates a FilterChain. A FilterChain is an ordered collection of independent Filters. The Filters perform the pre- and post-processing logic and are mapped to a particular Target resource. The FilterManager next makes a request to the FilterChain to process the Filters and subsequently the Target resource. The Intercepting Filter pattern can be implemented using a number of different strategies. Servlet specification compliant web servers implement the Intercepting Filter Standard Filter Strategy [32] whereby filters can be added declaratively through XML deployment descriptors. The web server container contains the FilterManager and is responsible for creating the different FilterChains that can be specified through the XML deployment descriptors. To monitor the web tier we have implemented a monitoring filter. The monitoring filter intercepts incoming requests and outgoing responses and thus can log the beginning and end of user requests. The monitoring filter can be added to the web application under test automatically (see section 5.1.3 below). Before the servlet 2.4 specification [33], filters could only be applied to the client request and could not be applied to inter servlet communications. As a result, the web tier is monitored as one software component on web servers that implement versions of the servlet specification prior to 2.4. However, the 2.4 servlet specification allows for filters to be applied to calls between servlets (i.e. forwarded requests) [34] and thus our monitoring filters can collect the entire web tier run-time path trace for servlet specification 2.4 compliant web servers.
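The participants described above can be sketched in a few lines of plain Java. This is a minimal illustration of the pattern only: the Target, Filter, FilterChain and FilterManager types below are hypothetical stand-ins written for this example, not the javax.servlet types, and the monitoring filter logs to an in-memory list rather than to the COMPAS Monitoring Dispatcher.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for the pattern participants; not the javax.servlet types.
interface Target {
    String handle(String request);
}

interface Filter {
    String doFilter(String request, FilterChain chain);
}

// Ordered collection of filters, followed by the target resource.
final class FilterChain {
    private final List<Filter> filters;
    private final Target target;
    private int next = 0;

    FilterChain(List<Filter> filters, Target target) {
        this.filters = filters;
        this.target = target;
    }

    // Passes the request to the next filter, or to the target once all filters have run.
    String proceed(String request) {
        if (next < filters.size()) {
            return filters.get(next++).doFilter(request, this);
        }
        return target.handle(request);
    }
}

// Logs the beginning and end of each request around the rest of the chain.
final class MonitoringFilter implements Filter {
    final List<String> log = new ArrayList<String>();

    @Override
    public String doFilter(String request, FilterChain chain) {
        long start = System.nanoTime();
        log.add("begin " + request);
        try {
            return chain.proceed(request);
        } finally {
            log.add("end " + request + " elapsedNanos=" + (System.nanoTime() - start));
        }
    }
}

// Plays the FilterManager role: builds a fresh chain for every incoming request.
final class FilterManager {
    private final List<Filter> filters;
    private final Target target;

    FilterManager(List<Filter> filters, Target target) {
        this.filters = filters;
        this.target = target;
    }

    String request(String request) {
        return new FilterChain(filters, target).proceed(request);
    }
}
```

In a servlet container the FilterManager role is played by the container itself and the filter list comes from the deployment descriptor, which is why no application code has to change.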
Since this has been mandated by the servlet specification, all future J2EE environments will support this feature. 5.1.2. EIS Tier Wrapper Components: The monitoring of the EIS tier is performed using wrapper components. This approach takes advantage of the proxy design pattern [35]. EIS tier monitoring works by wrapping the JDBC database driver with a monitoring layer. JDBC drivers implement the JDBC interfaces and classes of the JDBC API. For each original JDBC driver class, a wrapper class that exposes the same interface as the driver class is created. Rather than registering the JDBC driver class with the application server as normal, the packaged wrapper classes are registered with the application server. Calls made to the database layer from the application server are thus intercepted by the wrapper classes. The original JDBC driver is instead registered with the wrapper classes (declaratively using a properties file). Any calls made to the wrapper classes can thus be delegated to the original driver classes. The wrapper classes contain monitoring logic that captures all JDBC requests and responses. The wrapper classes have been implemented by extending the P6Spy open source framework [10]. As outlined above, COMPAS JEEM’s monitoring framework consists of a number of IPs. The IPs
capture performance metrics and resource usage information along with the control flow data required for run-time path tracing. 5.1.3. Adding The Monitoring Framework to a J2EE Application: Adding the aforementioned filters and wrapper components does not require the server code or the application source code to be available, nor does it require special JVM hooks. Instead we leverage standard J2EE mechanisms and make use of the component metadata that is stored in XML files as mandated by the J2EE specification. The monitoring components can be automatically added to the packaged application. The basic instrumentation script (i.e. the CPI process) provided by the original COMPAS monitoring framework has been extended to (1) unpack the entire J2EE application, (2) parse the XML deployment descriptors of the application (to obtain structural and functional information on the different components that make up the system) and (3) repackage the application with the COMPAS JEEM monitoring components inserted. 5.2. Run-time Path Tracing The ability to trace system wide run-time paths for multi-user workloads has been added to the capabilities of COMPAS JEEM. A generic run-time path tracing approach has been outlined in section 2.2. This approach has previously been implemented by Chen et al. in the Pinpoint open source project [4] [11]. However, a major drawback of the Pinpoint implementation is that it relies on instrumentation of the middleware and it is therefore tied to its target application server, JBoss. In the following sections we detail the Pinpoint implementation of the above approach and how we applied it in a non-intrusive capacity. 5.2.1. Pinpoint - Intrusive run-time path Tracing: Pinpoint is a framework for problem determination in internet service environments. It includes a run-time path tracer for the JBoss application server.
The requirements R1, R3 and R4 outlined in section 2.2 have been met in the Pinpoint implementation through instrumentation of the middleware. The following areas in the middleware were instrumented: the web server, the application server, a Remote Method Invocation (RMI) library and the JBoss JDBC Resource Adaptor [20]. Next we briefly detail the above instrumentation points and their responsibilities. R2 is the only requirement that has been met in a non-intrusive capacity by the Pinpoint framework and is described in section 5.2.2. User requests enter the system in the web tier. Pinpoint instruments both the HTTP server and the servlet container. The instrumentation in the HTTP server is responsible for determining when a new request enters the system (R1). It injects a unique ID (R2) value into the local thread of the new incoming request. The servlet container instrumentation is responsible for logging calls to and responses from each JSP/servlet (R4). The servlet container instrumentation is also responsible for the runtime path ordering logic in the
web tier. The application server instrumentation is performed in the Pinpoint project using JBoss interceptors [20]. The instrumentation logs calls to and from EJB components (R4) and performs runtime path ordering logic when a method is called. The JDBC Resource Adaptor instrumentation has similar responsibilities in relation to JDBC calls. Pinpoint also allows for tracing of run-time paths that span a number of threads (which for example might be caused by a remote method invocation) by instrumenting a JBoss RMI library such that the unique request ID is marshalled across the remote call from the local thread to the new (remote) thread (R3). 5.2.2. Non-Intrusive Run-time Path Tracing: We require a non-intrusive approach for run-time path tracing. In the following paragraphs we discuss how all the requirements (outlined in section 2.2), monitoring (R4), user request identification (R1), user request tagging (R2) and user request tracking (R3) can be met in a non-intrusive capacity. An application can be easily monitored (R4) by instrumenting middleware. Where the middleware source code is available it can be instrumented and recompiled such that the components running on top of the middleware are monitored and the required information logged (for instance performance metrics). The issue of monitoring the components in a non-intrusive manner is overcome using the COMPAS framework along with the monitoring extensions that we introduced above in section 5.1. The run-time path tracing approach discussed in section 2.2 requires user request tagging (R2), i.e. tagging each user request with a unique id. User requests are tagged by inserting a unique value into the local thread object when a new user request enters the system. The pinpoint tracing library performs this in a non-intrusive manner: A ThreadLocal object (introduced into Java in version 1.2) is associated with each thread in the system. 
The ThreadLocal objects can be used to hold a particular value (or object) for the lifetime of the thread. The Pinpoint tracing library can be used to inject a RequestInfo object into the ThreadLocal object. The RequestInfo object contains the RSI (i.e. the unique id of the user request and the sequence number of the current method call). Determining when a new user request arrives into the system can be easily achieved when instrumenting the middleware. If user requests are restricted to entering the system in the web tier, then the HTTP server (a component of the web server) can be instrumented such that each new request for an HTTP connection can be tagged. A new request for an HTTP connection represents a new user request. However, a non-intrusive approach cannot manipulate the HTTP server to determine when new user requests arrive. To overcome this we implemented a point of entry detection mechanism. Point of entry detection is required to find out whether a call to a particular component is the point of entry into the system, i.e. whether the call is the result of
a new client request. We have extended the Pinpoint tracing library to allow for point of entry detection (R1) in a non-intrusive manner. Our point of entry detection mechanism works by monitoring the depth of the calls made on the current thread. We have extended the Pinpoint RequestInfo object to include a Depth field to monitor the level of the current method call. The depth value is incremented when entering a method and decremented upon method exit. A depth of zero indicates that the current method is the first method called in the run-time path and is thus the point of entry into the system. When a call is made to a method with a depth of zero, a unique id can be assigned to that thread and remains associated with that thread until the call depth again reaches zero. Our point of entry detection mechanism is advantageous since (1) it is non-intrusive and (2) it can be applied to any of the tiers in a J2EE system and, unlike with the Pinpoint approach outlined above, new user requests need not be restricted to the web tier. The point of entry detection logic can be inserted into any of the IPs in figure 3 to identify new requests entering the system at any of the different tiers. The remaining issue to be addressed in relation to non-intrusive run-time path tracing is user request tracking (R3). While intrusive approaches can afford the luxury of instrumenting RMI libraries to marshal request IDs across threads, this cannot be achieved non-intrusively. Our approach instead makes the assumption that the web server and application server are co-located. The database can be deployed either locally or on a remote machine. This assumption is justifiable, since it is often the case that enterprise systems are deployed on co-located servers during development. It is common that such systems are only deployed in a distributed environment late in the development process when production testing takes place.
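The request tagging and point of entry detection described above can be sketched as follows. This is a minimal illustration under our own assumptions, not the Pinpoint or COMPAS JEEM code: the RequestTracker class, its enter and exit methods and the field names of RequestInfo are hypothetical, and the observation is returned as a plain array rather than being logged.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical per-thread request state, modelled on the extended RequestInfo object.
final class RequestInfo {
    long requestId = -1; // unique id of the user request, -1 when no request is active
    int sequence = 0;    // sequence number of the current method call
    int depth = 0;       // nesting level of the current method call
}

final class RequestTracker {
    private static final AtomicLong NEXT_ID = new AtomicLong();

    // One RequestInfo per thread, created lazily on first use.
    private static final ThreadLocal<RequestInfo> CURRENT = new ThreadLocal<RequestInfo>() {
        @Override
        protected RequestInfo initialValue() {
            return new RequestInfo();
        }
    };

    // Called by an interception point before the component method runs.
    // Returns {requestId, sequenceNumber, depth} for the observation to be logged.
    static long[] enter() {
        RequestInfo info = CURRENT.get();
        if (info.depth == 0) {
            // Depth zero: this call is the point of entry, so tag the thread with a new id.
            info.requestId = NEXT_ID.incrementAndGet();
            info.sequence = 0;
        }
        long[] observation = {info.requestId, info.sequence, info.depth};
        info.sequence++;
        info.depth++;
        return observation;
    }

    // Called by the interception point after the component method returns.
    static void exit() {
        RequestInfo info = CURRENT.get();
        info.depth--;
        if (info.depth == 0) {
            info.requestId = -1; // the request has left the system on this thread
        }
    }
}
```

Because the state lives in a ThreadLocal, an interception point in any tier sees the same request id as long as the call stays on the same thread, which is the property the co-location assumption relies on.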
In fact, it is becoming more and more common for the web tier and business tier to be co-located even in (often clustered) operational environments. The web and business tier can be co-located to take advantage of faster local calls between the tiers. Thus, these environments typically contain multiple identical machines, each running an instance of the web server and the application server, optimised for local inter-communication. Servers that are co-located run on the same Java Virtual Machine (JVM). Thus, when servers are co-located, for a particular user request that consists of local calls, the same thread is used across the web server, EJB server and database driver. Using our approach, the user request can therefore be tracked across the different tiers, since it can be identified by its unique id stored in the ThreadLocal object. An issue arises, however, if the run-time path includes remote calls to components that are deployed locally. For instance, a servlet may call an EJB that is deployed locally through its remote interface. In this case it is possible that a new thread is spawned when the remote call occurs. In fact, according to the RMI specification [36]: “A method dispatched by the RMI run-time to a remote object implementation
may or may not execute in a separate thread.” In practice, however, it seems that a new thread is not spawned when a remote call is made to a component that is in fact local to the caller. This is clearly shown in our results section below. We believe that application server vendors take this decision not to spawn new threads since spawning a new thread when invoking local components through their remote interfaces would most likely prove wasteful. The fact that new threads are not spawned allows for the tracing of user requests across all J2EE tiers using the approach we outlined above. Although less likely in current systems, it is not impossible for the application server instance and the web server instance to run on separate machines or in separate JVMs. In such cases, as well as in cases where application servers do spawn new threads when remote calls are made locally, we propose an alternative run-time path extraction approach. This approach is based on the COMPAS Interaction Recorder (IR) [9]. The main limitation of this approach is that it does not allow for simultaneous requests during the extraction process and thus can only be used in a controlled test environment. The IR is part of the original COMPAS monitoring framework. Essentially, it works by coordinating the ordering of received invocation events from the monitoring probes into interaction models (i.e. run-time call paths). This is performed through the IR’s Model Sequencer. The sequencer receives and stores processed invocation events from the monitoring probes via the Monitoring Dispatcher (see figure 3). The data carried by the invocation events includes the invoked method ID, invocation start time and invocation end time. The Model Sequencer constructs the interaction models by analysing the invocation start and end timestamps. Further detail on the COMPAS Interaction Recorder can be found in the literature [9].
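One way to order invocation events into a call path from start and end timestamps alone is sketched below. It is an illustration of the general idea under the single-request assumption stated above, not the actual Model Sequencer algorithm (see [9] for that); the InvocationEvent and TimestampSequencer names are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical invocation event carrying the method ID, start time and end time.
final class InvocationEvent {
    final String methodId;
    final long start;
    final long end;
    final List<InvocationEvent> callees = new ArrayList<InvocationEvent>();

    InvocationEvent(String methodId, long start, long end) {
        this.methodId = methodId;
        this.start = start;
        this.end = end;
    }
}

final class TimestampSequencer {
    // Returns the top-level calls, with nested calls attached as callees.
    // Assumes a single request at a time, so time containment implies a caller/callee relation.
    static List<InvocationEvent> sequence(List<InvocationEvent> events) {
        List<InvocationEvent> sorted = new ArrayList<InvocationEvent>(events);
        // Earlier start first; on equal starts the longer (enclosing) call first.
        sorted.sort((a, b) -> a.start != b.start
                ? Long.compare(a.start, b.start)
                : Long.compare(b.end, a.end));

        List<InvocationEvent> roots = new ArrayList<InvocationEvent>();
        Deque<InvocationEvent> open = new ArrayDeque<InvocationEvent>();
        for (InvocationEvent e : sorted) {
            // Discard calls on the stack whose interval does not enclose this event.
            while (!open.isEmpty() && e.end > open.peek().end) {
                open.pop();
            }
            if (open.isEmpty()) {
                roots.add(e);
            } else {
                open.peek().callees.add(e);
            }
            open.push(e);
        }
        return roots;
    }
}
```

With coarse timestamps, two sibling calls can appear to share an interval boundary, so a sketch like this cannot always tell a callee from a sibling; overlapping requests break the containment assumption entirely, which matches the limitation to controlled test environments.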
The Interaction Recorder can thus be used as a workaround for systems that are not co-located but run in a controlled test environment. We are currently working on a solution that will allow for call path extraction for (non-co-located) multi-JVM multi-user systems under load. Details on this solution are outlined in section 7. 5.3. Analysis COMPAS JEEM contains run-time path tracing logic which non-intrusively logs information when methods are invoked. The monitoring framework uses the COMPAS Monitoring Dispatcher to asynchronously log the information to a remote machine. This allows for the information to be stored and analysed off-line. Analysing the data remotely (as opposed to analysing it locally) reduces the performance impact on the monitored system. There are two main phases of information extraction from the collected data: run-time path construction and advanced run-time path analysis. It is important to note that the analysis part of the framework is completely independent of the approach used for run-time path extraction. The only requirement is
that the necessary information (see 5.3.1) is provided. 5.3.1. Run-Time Path Construction: On every invocation of a component method, information is logged. In the Pinpoint framework this is known as an observation. The logged information contains the following data:
• user request (unique) id
• sequence number
• call depth
• component details
• performance data
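Given observations with the fields listed above, run-time paths can be rebuilt by grouping on the user request id and ordering each group by sequence number. The following is a minimal sketch under our own assumptions: the Observation and PathBuilder classes are hypothetical, the component details are reduced to a name, and the performance data to a single elapsed time.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical observation record with one field per logged item.
final class Observation {
    final long requestId;
    final int sequence;
    final int depth;
    final String component;
    final long elapsedMillis;

    Observation(long requestId, int sequence, int depth, String component, long elapsedMillis) {
        this.requestId = requestId;
        this.sequence = sequence;
        this.depth = depth;
        this.component = component;
        this.elapsedMillis = elapsedMillis;
    }
}

final class PathBuilder {
    // Groups observations by request id, then orders each group by sequence number.
    static Map<Long, List<Observation>> build(List<Observation> observations) {
        Map<Long, List<Observation>> paths = new LinkedHashMap<Long, List<Observation>>();
        for (Observation o : observations) {
            paths.computeIfAbsent(o.requestId, id -> new ArrayList<Observation>()).add(o);
        }
        for (List<Observation> path : paths.values()) {
            path.sort((a, b) -> Integer.compare(a.sequence, b.sequence));
        }
        return paths;
    }

    // Renders one path as an indented call tree, using the call depth for indentation.
    static String render(List<Observation> path) {
        StringBuilder sb = new StringBuilder();
        for (Observation o : path) {
            for (int i = 0; i < o.depth; i++) {
                sb.append("  ");
            }
            sb.append(o.component).append('\n');
        }
        return sb.toString();
    }
}
```

The call depth is not needed for ordering, but it lets the ordered list be read back as a tree of callers and callees without any further information.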
Run-time paths are constructed by grouping all observations by user request id. Each such group represents an unordered list of the calls that make up the run-time path. The calls in each group are then ordered according to their sequence numbers. This completes the construction of the run-time paths. Further analysis can be applied to this information for a number of purposes. For example, by analysing the run-time paths developers can easily construct UML diagrams that can help to deduce the overall system structure (see section 6.3). Component relationships can be identified, which helps developers gain a better understanding of their system. For instance, understanding inter-component relationships enables developers to anticipate potential conflicts and debug problems as well as allowing developers to reason about their system design (which in turn can have a major impact on system performance [37]). 5.3.2. Advanced run-time path Analysis: Once constructed, the run-time paths can be further analysed. Advanced run-time path analysis has been applied in a number of different areas (see section 2.1). The COMPAS JEEM framework can be easily extended further to allow tool users to apply advanced run-time path analysis techniques. 5.4. Framework Portability One of the main advantages of our approach is that it is portable across the J2EE spectrum of platforms and can be applied independent of the underlying middleware implementation. Moreover, much of the framework itself is portable across different component technologies (such as .NET or CCM). In fact, only the framework instrumentation process (CPI) and the request tracking (using the ThreadLocal approach) are specific to J2EE. Therefore much of our implementation can be reused across technologies as long as an instrumentation process and request tracking approach can be provided. The CORBA Portable Interceptors [38], for example, allow for the insertion of custom monitoring code (i.e.
instrumentation of the application) and the piggy backing of data along a request (which allows for request tracking). Thus our approach could also be applied in the CORBA environment by making use of the Portable
Interceptors provided for through the CORBA standard and by reusing much of our implementation to perform run-time path tracing (i.e. the tagging logic, the point of entry detection mechanism, the run-time path ordering logic, the monitoring dispatcher, and the run-time path analysis modules). 6. Results In this section we show how we applied our tool prototype to a number of J2EE applications and how we could quickly deduce the system structure of the applications by analysing the run-time paths collected. For one of the applications we show that we could easily identify a number of performance design antipatterns from the system structure that was deduced. We also show the portability of our approach by applying it to applications running on application servers from a number of different vendors. Finally we give details on the performance overhead incurred by applying COMPAS JEEM to a J2EE application. 6.1. Deducing System structure We applied COMPAS JEEM to two J2EE applications: (1) a sample online banking application, called Duke’s Bank and (2) a sample e-commerce application, called PlantsByWebsphere. Duke’s Bank is a sample application provided by Sun Microsystems as a showcase for the J2EE technology. PlantsByWebsphere is a sample e-commerce application provided by IBM. We monitored both applications using COMPAS JEEM and performed a number of user actions. The run-time paths that corresponded to each user action were obtained. A diagrammatic representation of each system was constructed by analysing the run-time paths. Duke’s Bank is an online banking application. When a user logs in to the Duke’s Bank application he/she can perform the following actions: view a list of accounts, view an individual account’s details, withdraw or lodge cash, transfer cash from one account to another or finally log off. Each of the different user actions was performed and the run-time paths recorded.
Figure 5 shows the run-time path associated with the account list user action. It shows the interactions that occur between components to satisfy a user request (for the list of accounts that belong to a particular customer). The run-time paths collected by COMPAS JEEM also contain related performance metrics (e.g. method execution time as shown in figure 5). Such information can be useful for identifying bottlenecks or performance design issues that exist in the application. Five different components make up the run-time path: accountList (web component), AccountControllerBean (session bean), AccountBean (entity bean), PreparedStatement (JDBC component) and ResultSet (JDBC component). The run-time path can be easily traversed to identify the relationships that exist between the different components. By analysing 6 run-time paths we manually produced the diagram in figure 6. Figure 6 shows the design of the section of the application which is responsible for handling the different requests that can
be made by banking customers that log in to the system. The diagram was constructed manually by traversing the run-time paths. The testing environment consisted of the JBoss application server (version 3.2.7) and a MySQL database (version 4.0.2). Because the web server that comes as part of JBoss 3.2.7 implements the servlet 2.3 specification, the web tier is monitored as one software component, as shown in figure 6. The diagrams that we produced can be used for identifying performance design issues within enterprise applications. For example, by analysing figure 6 we quickly identified the existence of two EJB performance antipatterns in the Duke’s Bank application. The diagram consists of four EJBs: two stateful session beans and two entity beans.
• Antipattern 1: The first antipattern we identified is known as the Conversational Baggage antipattern [14]. It identifies situations where a stateful session bean is used but not required. This can be wasteful since there is a performance overhead required to manage stateful sessions within the EJB container. Often a (more lightweight) stateless session bean would suffice. Figure 6 shows that a stateful session bean is invoked for each user request. However, there is no obvious need for a stateful session bean in any of the user actions we performed, as state is not maintained across the different web pages returned for the related user requests. The stateful session beans in figure 6 could easily be replaced by more lightweight stateless session beans, which is the suggested solution to this antipattern.
• Antipattern 2: The second issue we identified with the Duke’s Bank application was that the entity beans exposed remote interfaces. The entity beans in this application (unlike the session beans) are only accessed by EJBs which would most likely be deployed in the same container. Thus there is no real need for the entity beans to expose remote interfaces. The beans can be refactored to expose local interfaces and thus reduce the overhead associated with remote method calls.
PlantsByWebsphere is a sample e-commerce application that comes installed with Websphere 6.1. Websphere 6.1 contains a web server that conforms to the servlet 2.4 specification and thus COMPAS JEEM can monitor calls between web components. As above, we performed a number of user actions in the PlantsByWebsphere application and constructed a diagram by analysing the corresponding run-time paths. In total 19 user actions were performed. Figure 7 shows the diagram that was constructed. The diagram shows the different web and business tier components that made up the application (for simplicity we did not analyse the JDBC components for PlantsByWebsphere). As shown in figures 6 and 7, COMPAS JEEM monitors only the main components (JSPs, Servlets, EJBs and JDBC calls) that make up the J2EE system. Smaller utility classes and the underlying middleware calls are omitted. This allows for run-time paths (and thus diagrams) that contain only the main components of the application. As a result our run-time paths and diagrams are not cluttered
with unimportant method calls that play only a minor role in the overall application design structure. 6.2. Portability Assessment In order to show that our approach is truly portable we applied it to a number of application servers running J2EE applications. The table in figure 8 shows results from applying our tool to three different application servers: JBoss, Websphere and OC4J. The table gives the server vendor, the server version, results from monitoring the web tier and results from monitoring the business tier. In relation to web tier monitoring we tested to see if the interactions between web components could be monitored. From the results table it can be seen that individual web tier components could be monitored only on the Websphere application server. In JBoss and OC4J the web tier was monitored as one software component. This can be explained by the fact that both the JBoss and OC4J versions that we used during testing contain servlet containers that conform to the servlet 2.3 specification. Later versions of the application servers will conform to the servlet 2.4 specification and will allow for complete web tier monitoring, as is the case with Websphere 6.1. The testing of the business tier was designed to determine if calls made to a component through a remote interface from a local component (i.e. within the same JVM) could be traced using our technique based on the thread preservation in EJB remote calls, outlined in section 5.2.2. For the three application servers that were tested, it was found that in situations where components were located within the same JVM all calls made between components could be traced regardless of whether the calls were through local or remote interfaces.
Even though the RMI specification stipulates that “a method dispatched by the RMI run-time to a remote object implementation may or may not execute in a separate thread”, our results suggest that in the case of current application servers a new thread is not spawned for such calls when the objects are co-located. Since this is the case, the approach we outline for tracing run-time paths above can be applied. The monitoring of the EIS tier is achieved by extending the P6Spy open source monitoring framework. P6Spy can be used to monitor database statements of any application that uses JDBC [10]; thus the EIS tier monitoring is portable across different database implementations and is independent of the web and application server tiers. 6.3. Performance Overhead To assess the performance overhead incurred by COMPAS JEEM we instrumented the web and business tiers of the PlantsByWebsphere application described above and ran a number of performance tests. The test environment was made up of three machines. The first machine (Pentium 4, 2.25GHz, 1GB RAM) was used for load generation. Apache JMeter was used as a load generation tool. PlantsByWebsphere was
installed on WebSphere 6.1 which ran on a second machine (Pentium M, 1.7GHz, 1GB RAM). Finally a third machine (Pentium 4, 1.6 GHz, 768 MB RAM) was employed to log the data produced by COMPAS JEEM remotely such that the performance overhead on the machine running the J2EE application could be kept to a minimum. This is common practice for production environments. To assess the performance overhead we ran a number of test cases with different user loads for two versions of the sample application (i.e. one version included COMPAS JEEM instrumentation and the other version did not). The test cases consisted of a warm up period (to warm up the JVM as suggested in the literature [39]), followed by a measurement period. During the different test cases we observed the response time of the application for the following loads: 20 users, 50 users, 100 users, 150 users, 200 users. Users entered the system at a rate of one user per second with a random delay of between 1-2 seconds. Each measurement period lasted 30 minutes. For each of the given loads three test cases were run and an average value obtained. To assess the performance overhead we analysed the average response time of the instrumented application versus the average response time of the non-instrumented application for the same user load. The results are shown in figure 9. Note that the response time for each run is an average value that has been normalised based on the average response time for 20 users (non-instrumented). At 20 and 50 users there was no noticeable difference (on average) in response time for either the instrumented or non-instrumented version of the application. Thus there was no noticeable overhead for these loads. At 100 users there was a 10% increase in response time for both the instrumented and non-instrumented versions compared with 20 users. Again there was no noticeable overhead that could be attributed to COMPAS JEEM. 
At 150 users, however, the system was close to its saturation point and response time increased significantly. For the non-instrumented application response time increased by 310%. The instrumented version at 150 users showed an increase of 390%. Taking both increases relative to the normalised 20 user baseline, this corresponds to response times of roughly 4.9 and 4.1 times the baseline respectively, i.e. a ratio of about 1.19. Therefore there was a 19% overhead that could be attributed to COMPAS JEEM at 150 users. At 200 users (for the default application server settings) the system under test had reached its capacity for both the instrumented and non-instrumented application. There were a large number of errors at this user load and the results were ignored. As can be seen from the results above there is no obvious overhead when COMPAS JEEM is applied to a non-saturated system. This can be explained by the fact that COMPAS JEEM only monitors at the component level and does not instrument lower level utility classes (unlike most of today’s JVMPI [29] based monitoring tools). Adding a small number of monitoring probes is insignificant in the context of large enterprise applications that generally contain very large numbers of classes. The sudden increase in overhead at 150 users can be explained by the fact that at this point the system seems to be close to its saturation point, with response time over 4 times slower than at lower loads. Any further increase
in load at this point adds to the bottleneck and increases response time significantly. Further testing is required to assess if COMPAS JEEM will scale for increased loads when more hardware is available. 7. Future Work An effort is currently being undertaken to enhance our approach such that it can be applied to J2EE applications that are under load and distributed across multiple JVMs. Our current approach is limited to co-located web/application server environments for systems under load (although a workaround has been provided for controlled environments). The reason for this current limitation is that J2EE (unlike CCM) does not provide a mechanism for piggy backing information with a remote request such that RSI can be passed over a remote invocation. Our solution to this problem is to add an additional parameter to the remote request, allowing the client to pass the additional request data through that additional parameter. This can be achieved by replacing the client-side stub with a custom wrapper such that the client’s request invocation can be intercepted before invoking the server side EJB. The custom wrapper intercepts the invocation and forwards on the additional parameter with the request. The client side stub can be replaced using a standard J2EE mechanism (the PortableRemoteObjectDelegate object). The PortableRemoteObjectDelegate object can be used to replace the method where the client obtains a reference to the stub object. By replacing this method it is possible to give the client a reference to a custom wrapper such that the invocation is intercepted and the RSI data sent along with the request. On the server side our monitoring probes can be modified to accept this extra parameter and to tag the new thread in the remote JVM with the RSI data. The PortableRemoteObjectDelegate object can be configured as part of Java’s runtime environment.
Therefore this approach does not require application or application server source code to be available and is completely portable across different application servers.

Work is also being carried out in the area of run-time path analysis. Our previous work has shown how data mining can be applied to run-time paths to identify common software design mistakes [3]. This effort is being continued, and further data mining techniques are being applied with the aim of finding patterns of interest in the run-time path data. In particular we are applying clustering techniques [40], association rule mining (ARM) [41] and sequential rule mining [42] to the data collected by COMPAS JEEM. Clustering techniques divide data into groups of similar objects [43]. We intend to apply clustering to the run-time paths collected. This will allow us to group run-time paths that are similar and that might be associated with a particular user action, thus reducing the number of run-time paths that are presented to the tool user. ARM has traditionally been used in market basket analysis to identify items that are frequently purchased together. We have previously shown how ARM
can be applied to run-time data to identify methods that are frequently used together [3]. Often these methods can be the focus of optimizations. Similarly, sequential rule mining can be applied to run-time paths. Sequential rule mining has previously been used to find frequently occurring patterns in data [42]. We intend to apply it to automatically identify frequently occurring patterns of method calls (e.g. loops) in the run-time paths.

8. Conclusions

In this paper we present an approach for non-intrusive end to end monitoring of component-oriented systems, focused on J2EE. Our implementation combines and extends a number of open source projects (COMPAS, Pinpoint and P6Spy) to allow for system-wide tracing of run-time paths. The paper shows how each of the projects has been integrated and extended. In particular we give details on the COMPAS monitoring framework, which was originally implemented for EJB monitoring, and show how it has been extended to monitor all server-side tiers of a J2EE system. We also present how the Pinpoint tracing library has been extended to allow for non-intrusive run-time path tracing. Our results section shows the performance overhead incurred by applying COMPAS JEEM to a J2EE application. This section also presents how COMPAS JEEM can be applied to a number of different application servers and thus shows the portability of our approach. We also show the utility of our tool by monitoring two J2EE applications. We explain how the collected data can be easily analysed and the overall system structure deduced. For one of the applications we show how performance design issues can be easily identified from diagrams produced from the run-time paths. The future work section gives details on how the tool can be further extended such that it can be applied to web and application servers that are distributed across JVMs.
That section also introduces a number of data mining techniques that are currently being investigated and that can be used to analyse the run-time paths collected by COMPAS JEEM. Related work is referred to and discussed throughout the document where appropriate.

Acknowledgments

We would like to thank Dr. Thomas Gschwind for his ideas on piggybacking data on remote invocations. We are currently working with Dr. Gschwind on implementing the related solution outlined in section 7. Our work is funded under the Commercialisation Fund from the Informatics Research Initiative of Enterprise Ireland.

References

[1] Keogh J.E.: “J2EE the Complete Reference”, Osborne, McGraw-Hill, September 2000
[2] Chen M., Kiciman E., Accardi A., Fox A. and Brewer E.: “Using runtime paths for macro analysis”, Proc. 9th Workshop on Hot Topics in Operating Systems, Lihue, HI, USA, May 2003
[3] Parsons T. and Murphy J.: “A Framework for Automatically Detecting and Assessing Performance Antipatterns in Component Based Systems Using Run-Time Analysis”, The 9th International Workshop on Component Oriented Programming, part of ECOOP, Oslo, Norway, 2004
[4] Chen M., Kiciman E., Fratkin E., Fox A. and Brewer E.: “Pinpoint: Problem Determination in Large, Dynamic, Internet Services”, Proc. Int. Conf. on Dependable Systems and Networks (IPDS Track), Washington, D.C., June 2002
[5] http://www.quest.com/performasure/, accessed June 2006
[6] Gschwind T., Eshghi K., Garg P. K. and Wurster K.: “WebMon: A Performance Profiler for Web Transactions”, Proc. Fourth IEEE International Workshop on Advanced Issues of E-Commerce and Web-Based Information Systems, Newport Beach, California, USA, June 2002
[7] http://compas.sourceforge.net/, accessed June 2006
[8] Mos A. and Murphy J.: “Performance Management in Component-Oriented Systems using a Model Driven Architecture Approach”, Proc. 6th IEEE International Enterprise Distributed Object Computing Conference, Lausanne, Switzerland, September 2002
[9] Mos A.: “A Framework for Adaptive Monitoring and Performance Management of Component-Based Enterprise Applications”, PhD thesis, Dublin City University, Ireland, 2004
[10] http://www.p6spy.com/, accessed June 2006
[11] http://swig.stanford.edu/pinpoint.shtml, accessed June 2006
[12] Gorton I. and Zhu L.: “Tool Support for Just-In-Time Architecture Reconstruction and Evaluation: An Experience Report”, Proc. 27th International Conference on Software Engineering, St. Louis, Missouri, USA, 2005
[13] Kamin S. and Hyatt D.: “A Special Purpose Language for Picture Drawing”, Proc. Conference of Domain Specific Languages, Santa Barbara, CA, USA, 1997
[14] Tate B., Clarke M., Lee B. and Linskey P.: “Bitter EJB”, Manning, 2003
[15] Parsons T. and Murphy J.: “The 2nd International Middleware Doctoral Symposium: Detecting Performance Antipatterns in Component-Based Enterprise Systems”, IEEE Distributed Systems Online, vol. 7, no. 3, March 2006
[16] Parsons T.: “A Framework for Detecting Performance Design and Deployment Antipatterns in Component Based Enterprise Systems”, Proc. 2nd Int’l Middleware Doctoral Symp., ACM Press, Grenoble, France, 2005, art. no. 7
[17] Barham P., Donnelly A., Isaacs R. and Mortier R.: “Using Magpie for request extraction and workload modelling”, Symposium on Operating Systems Design and Implementation, San Francisco, CA, USA, December 2004, pp 259-272
[18] Larus J.R.: “Whole Program Paths”, ACM SIGPLAN Conference on Programming Language Design and Implementation, Atlanta, GA, USA, May 1999, pp 259-269
[19] Mos A. and Murphy J.: “COMPAS: Adaptive Performance Monitoring of Component-Based Systems”, Proc. Workshop on Remote Analysis and Measurement of Software Systems at 26th International Conference on Software Engineering, Edinburgh, Scotland, May 2004
[20] Kunnumpurath M.: “JBoss Administration and Development Third Edition (3.2.x Series)”, Apress, October 2003
[21] Kovari P., Cerecedo Diaz D., Fernandes F. C. H., Hassan D., Kawamura K., Leigh D., Lin N., Masicand D., Wadley G. and Peter Xu: “WebSphere Application Server Enterprise V5 and Programming Model Extensions WebSphere Handbook Series”, September, August, 2003
[22] http://java.sun.com/j2ee/, accessed June 2006
[23] http://java.sun.com/products/servlet/, accessed June 2006
[24] http://java.sun.com/products/ejb/docs.html, accessed June 2006
[25] http://java.sun.com/j2ee/connector/, accessed June 2006
[26] http://java.sun.com/products/jdbc/, accessed June 2006
[27] Szyperski C., Gruntz D. and Murer S.: “Component Software: Beyond Object-Oriented Programming”, Addison-Wesley, November 2002
[28] http://www.jboss.org, accessed June 2006
[29] http://java.sun.com/j2se/1.4.2/docs/guide/jvmpi/jvmpi.html, accessed June 2006
[30] http://java.sun.com/j2se/1.5.0/docs/guide/jvmti/jvmti.html, accessed June 2006
[31] Agarwal M. K., Gupta M., Kar G., Neogi A. and Sailer A.: “Mining Activity Data for Dynamic Dependency Discovery in e-Business Systems”, IEEE eTransactions on Network and Service Management Journal, vol. 1, no. 2, September 2004
[32] Alur D., Crupi J. and Malks D.: “Core J2EE Patterns: Best Practices and Design Strategies”, Prentice Hall, Sun Microsystems Press, 2001
[33] http://java.sun.com/products/servlet/download.html, accessed June 2006
[34] http://www.javaworld.com/javaworld/jw-03-2003/jw-0328-servlet.html, accessed June 2006
[35] Gamma E., Helm R., Johnson R. and Vlissides J.: “Design Patterns: Elements of Reusable Object-Oriented Software”, Addison-Wesley, 1995
[36] http://java.sun.com/j2se/1.4.2/docs/guide/rmi/, accessed June 2006
[37] Cecchet E., Marguerite J. and Zwaenepoel W.: “Performance and Scalability of EJB Applications”, Proc. 17th ACM SIGPLAN Conference on Object-Oriented Programming, Systems, Languages, and Applications, Seattle, Washington, USA, 2002, pp 246-261
[38] http://www.omg.org/technology/documents/, accessed June 2006
[39] Buble A., Bulej L. and Tuma P.: “CORBA Benchmarking: A Course with Hidden Obstacles”, Proc. IPDPS Workshop on Performance Modeling, Evaluation and Optimization of Parallel and Distributed Systems, Nice, France, 2003
[40] Jain A. K., Murty M. N. and Flynn P. J.: “Data clustering: a review”, ACM Computing Surveys, vol. 31, no. 3, 1999, pp 246-323
[41] Agrawal R., Imielinski T. and Swami A. N.: “Mining Association Rules between Sets of Items in Large Databases”, Proc. ACM SIGMOD International Conference on Management of Data, Washington D.C., USA, 1993, pp 207-216
[42] Mannila H., Toivonen H. and Verkamo A. I.: “Discovery of Frequent Episodes in Event Sequences”, Data Mining and Knowledge Discovery, vol. 1, no. 3, 1997, pp 259-289
[43] Hand D., Mannila H. and Smyth P.: “Principles of Data Mining”, MIT Press, 2001
Figures
Fig. 1 Typical J2EE Architecture
Fig. 2 COMPAS Probe Insertion Process (diagram labels: TC, PCT, MC, CPI, Probe, Instrumentation Layer, COMPAS Probe Libraries, network, COMPAS Monitoring Dispatcher)
Fig. 3 COMPAS JEEM Architecture
Fig. 4 Intercepting Filter
Fig. 5 Run-time path and UML sequence diagram
Fig. 6 Diagram showing components in Dukes Bank
Fig. 7 Diagram showing components in PlantsByWebsphere
Server Vendor | Version | Web Tier | Business Tier
JBoss Group | JBoss 3.2.7 | Monitored as one component | Remote and Local Components Monitored
IBM | Websphere 6.1 | Monitored as multiple components | Remote and Local Components Monitored
Oracle | OC4J 10.1.2.0.2 | Monitored as one component | Remote and Local Components Monitored
Fig. 8 Portability Test Results
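Application Version | 20 Users | 50 Users | 100 Users | 150 Users | 200 Users
Non-Instrumented resp. time | 1 | 1 | 1.1 | 4.12 | X
Instrumented resp. time | 1 | 1 | 1.1 | 4.9 | X
Overhead | 0 | 0 | 0 | 19% | X
Fig. 9 Performance Overhead Test Results (X: results ignored because of the large number of errors at this user load)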
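The 19% figure reported for 150 users is consistent with treating overhead as the relative difference between the instrumented and non-instrumented response times. The sketch below works through that arithmetic; the formula is our reading of the reported numbers rather than a definition stated in the text, and the class and method names are illustrative only.

```java
// Hypothetical helper: the formula is an assumption that matches the reported 19%.
class OverheadSketch {

    /** Relative overhead of the instrumented run, as a percentage of the baseline. */
    static double overheadPercent(double nonInstrumented, double instrumented) {
        return (instrumented - nonInstrumented) / nonInstrumented * 100.0;
    }

    public static void main(String[] args) {
        // Response times at 150 users, taken from Fig. 9.
        double nonInstrumented = 4.12;
        double instrumented = 4.9;
        // (4.9 - 4.12) / 4.12 is roughly 0.189, which rounds to the reported 19%.
        System.out.println(overheadPercent(nonInstrumented, instrumented));
    }
}
```

Applied to the 20, 50 and 100 user rows, where both response times are equal, the same calculation gives an overhead of zero.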