UNIVERSITY OF GENEVA FACULTY OF SCIENCES DEPARTMENT OF COMPUTING SCIENCE
IMPLEMENTATION OF CPU RESOURCE ACCOUNTING FOR JAVA
MASTER THESIS
By Rory Gavino VIDAL BURGOA
Director of M. Thesis:
Prof. Jürgen HARMS
Supervisors:
Alex VILLAZON Walter BINDER
University of Geneva, Department of Computing Science, 24, rue Général-Dufour, 1211 Geneva 4, Switzerland
July 2001
To my dear family.
Table of Contents

List of Tables
List of Figures
Acknowledgements
Abstract
Introduction
1 Extracting CPU resource information
    1.1 Byte-code instrumentation
        1.1.1 Byte-code engineering
    1.2 Basic accounting definitions
        1.2.1 Accounting blocks
        1.2.2 Accounting object
    1.3 Rewriting rules
        1.3.1 Method modifications
        1.3.2 Native Methods and JDK rewriting
    1.4 Summary
        1.4.1 Synopsis of rewriting rules
2 Implementation of the rewriting tool
    2.1 Byte-Code Engineering Library (BCEL)
        2.1.1 Static part
        2.1.2 Generic part
        2.1.3 Section Summary
    2.2 The rewriting tool
        2.2.1 Design of the rewriting tool
        2.2.2 Summary of the rewriting process
    2.3 The accounting process for a method
        2.3.1 Creation of the Control Flow Graph (CFG)
        2.3.2 Adding accounting information
        2.3.3 Adjusting invocations in the method body
        2.3.4 Adding account instructions
        2.3.5 Creation of the new method with the accounting argument
        2.3.6 Creating redirections and wrappers
    2.4 Summary
3 Rewriting Optimization Algorithms
    3.1 Optimizations
        3.1.1 O1
        3.1.2 O2
        3.1.3 O3
        3.1.4 O4
        3.1.5 Combinations and heuristics
    3.2 Implementation of optimizations
        3.2.1 O1
        3.2.2 O2, O3
        3.2.3 O4
        3.2.4 Combinations
    3.3 Summary
4 Evaluations
    4.1 CPU Rewriting Tool Evaluation
    4.2 Evaluation with optimizations
        4.2.1 Disabled Just-in-Time (JIT) compiler
        4.2.2 Enabled Just-in-Time (JIT) compiler
        4.2.3 Estimation of the size overhead
5 Applications
    5.1 A simple applet demo
    5.2 Integration to JSEAL-2 mobile agent system
        5.2.1 JSEAL-2 Scheduler
    5.3 Other possible applications
6 Conclusions and Future work
A Rewriting Tool Execution
    A.1 Packages
    A.2 Usage
        A.2.1 analyzer
        A.2.2 cpu
Bibliography
List of Tables

1.1 The CPUAccount implementation
1.2 CPU accounting in a method
1.3 Rewriting two versions for each method
1.4 Solving size overhead using wrappers
1.5 Wrappering native methods
1.6 Second approach for native methods: Using structure analysis
1.7 Rewriting of abstract methods
1.8 Rewriting constructors
2.1 Summary of transformations of a class
3.1 Basic accounting code (6 instructions)
3.2 Updating local variable and usage accounting value (8 instructions)
3.3 Updating local variable and setting usage value (4 instructions)
4.1 Overhead of CPU accounting (time in seconds). JIT and volatile disabled.
4.2 Overhead of CPU accounting (time in seconds). JIT enabled.
4.3 Benchmarks measuring the reduction of overhead by optimizations. JIT disabled.
4.4 Total overhead reduction. JIT compiler disabled.
4.5 Benchmarks measuring the reduction of overhead by optimizations. JIT enabled.
4.6 Total overhead reduction
4.7 Size overhead of different optimizations applied to SPECjvm98
List of Figures

1 Overview of the rewriting process
1.1 Sample of an accounting block analysis
1.2 Updating usage field in byte-code
2.1 UML diagram of the rewriting tool for CPU accounting
2.2 Rewriting at byte-code level
3.1 Optimization 1
3.2 Optimization 2
3.3 Optimization 3
3.4 Optimization 4
4.1 Different configurations of SPECjvm98 (unmodified and rewritten)
4.2 Summary of the overhead for accounting
4.3 Overhead of optimized versions of rewritten SPECjvm98 in percentage. JIT compiler disabled.
4.4 Overhead of optimized versions of rewritten SPECjvm98 in percentage
5.1 Accounting CPU resource: demonstration applet
5.2 Illustration of the general resource control model
Acknowledgements

I would like to thank my supervisors, Alex Villazón and Walter Binder, for their valuable suggestions, comments, and help. Without the hundreds of e-mails that we exchanged, this work would never have been possible. Special thanks to Alex, who dedicated his time and patience to me and who encouraged and helped me to write this thesis in English. I am grateful to my parents and family for their support and their patience during my extended absence; thank you for getting me started.
Geneva, Switzerland July, 2001
Rory Gavino VIDAL BURGOA
Abstract

Recently, there has been increasing interest in the use of mobile code technology for the implementation of distributed applications and systems. Accepting foreign mobile code (or mobile agents) for execution in a given execution environment (EE) raises concerns about how resources are distributed among the applications (and agents). Malicious or buggy agents can, for example, disturb the correct execution of other agents or even crash the EE. To avoid this vulnerability, a model of resource control is necessary. Resource control implies accounting for and limiting the consumption of computational resources in a system (CPU, memory, network bandwidth, etc.).

In this thesis, we describe how CPU accounting can be performed for applications written in Java. This is important for any mobile code or agent system based on Java, since Java does not provide any support for resource accounting. The main contribution of this work is the implementation of a tool that allows relative CPU accounting of Java applications without requiring the source code to be modified. Furthermore, our tool is completely implemented in Java, ensuring complete portability. This is also important for mobile code based systems, since the use of native code limits deployment.

The approach is based on counting Java byte-code instructions as a relative unit of CPU consumption. The tool inserts the byte-code instructions needed for accounting directly into the application byte-code, so the application source code is not needed. This is another important contribution for mobile code based applications.

We have measured the overhead introduced by relative CPU accounting in complex Java applications. The results show that our approach introduces only a small overhead (about 5%) if the right optimizations are chosen.

Finally, we describe how CPU accounting is used in simple examples and how the accounting information it produces can be used in a mobile agent system for resource control purposes.
Introduction

Java [12] is probably one of the most successful platforms for the implementation of mobile code and mobile agent systems. It provides several interesting features, such as object orientation, multithreading, strong typing, and portable executable code, called Java byte-code. However, Java does not provide any resource control (accounting and limiting) facility, which is highly desirable in mobile code environments. Several mobile agent systems have been implemented in Java¹. Some of them provide resource control facilities but rely on operating system (OS) specific features, i.e. they are implemented in native code, which limits portability.

¹ An (incomplete) list of different mobile agent systems is given in the "Mobile Agent List" (MAL) by Hohl, see [13].

In a mobile agent environment, it may be interesting to know how many physical resources an agent consumes (CPU, memory, network, battery, etc.) as well as how many logical resources (number of threads that it spawns, number of agents that an application involves, etc.). One application of resource accounting is a resource control mechanism that makes it possible, for example, to detect that a malicious (or buggy) agent consumes more resources than allowed. Such attacks, called denial-of-service (DoS) attacks, are difficult to prevent without appropriate resource accounting.

CPU resource consumption, however, is difficult to measure without the support of the underlying OS. In Java, this implies the use of native code libraries, which is not a desirable solution (e.g. JRes [9]). Another possible solution is to modify the Java run-time system, the Java Virtual Machine (JVM) [16], to support resource control (e.g. KaffeOS [1]). This solution, however, also limits the deployment of mobile code applications.

The approach followed in this thesis performs accounting in a relative unit: the number of byte-code instructions that an application executes. Even though this solution does not give accurate CPU consumption information, it is sufficient for the control of applications without real-time requirements. Furthermore, it yields a fully portable solution, since it does not require any native code and can itself be implemented in Java (the byte-code format contains enough symbolic information to allow modifications directly at byte-code level).

The implementation described in this thesis is part of a more complete resource control model, described in [3], that controls CPU and memory resources in the JSEAL-2 mobile agent system. Our implementation shows that it is possible to obtain enough information about the CPU resource consumption of an application before the actual execution is performed, and that the accounting introduces only reasonable overhead. This makes it possible to design a resource control model that prevents DoS attacks in mobile agent environments.

The introduction of relative CPU accounting into an arbitrary Java class requires low-level modifications of the byte-code (to add the actual accounting byte-code). The basic idea is to add a new accounting object containing a counter that is updated whenever a block of byte-code instructions is executed. The tool that we have implemented adds the necessary byte-code instructions at strategic locations in the Java byte-code so that accounting is performed before the code is actually executed.

Ideally, the introduction of such accounting code should not require any modification of the original application structure, i.e. the accounting code would simply be inserted into all method bodies. However, updating the CPU account value in this way may introduce enormous overhead, since each update requires a look-up of the account associated with the current thread. The solution proposed here therefore modifies the application structure by passing the accounting object directly as an additional argument to each method. This approach considerably reduces the overhead, because the look-up operation is no longer needed.

Since the accounting code is introduced directly at byte-code level, complex byte-code analysis is necessary to determine where the accounting byte-code must be inserted. Furthermore, the modification of the application structure (e.g. changes to method signatures) requires tricky manipulations of the byte-code instructions. Finally, since the modifications must be applicable to arbitrary Java classes, the solution must take all possible cases into account (e.g. inter-class dependencies, invocations of native code, synchronization, etc.), and the tool must generate consistent, ready-to-execute byte-code that does not change the original functionality of the class.

Figure 1 shows an overview of how the accounting byte-code instrumentation is performed. The tool takes the Java byte-code of an arbitrary application and generates a modified version of the application containing the accounting code. The modification is performed off-line, i.e. all classes of an application must be modified before their execution. However, since the tool is implemented in pure Java, it is possible to integrate this byte-code instrumentation mechanism into a system that instruments the byte-code on-line, i.e. at load time, such as a mobile agent system.
[Figure 1: Overview of the rewriting process. Diagram labels: Java application (the original application's byte-code); Java-based rewriting tool; CPU info; CPU-aware Java application (modified application with CPU accounting).]
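The cost argument made before Figure 1 (a per-update look-up of the current thread's account versus an account passed as an extra argument) can be illustrated with a small Java sketch. It is only an illustration at source level, whereas the tool works on byte-code. The class name AccountingStyles, the field name usage, the use of ThreadLocal for the look-up, and the constant 4 charged per loop iteration are all assumptions made for this example and are not taken from the tool.

```java
final class AccountingStyles {
    // Variant 1: the original signature is kept, so the account has to be
    // found through the current thread on every update.
    private static final ThreadLocal<CPUAccount> CURRENT =
            ThreadLocal.withInitial(CPUAccount::new);

    static CPUAccount currentAccount() {
        return CURRENT.get();
    }

    static int sumWithLookup(int[] values) {
        int sum = 0;
        for (int v : values) {
            currentAccount().usage += 4; // one look-up per accounted block
            sum += v;
        }
        return sum;
    }

    // Variant 2: the signature is extended; the caller hands the account
    // over, so the method only performs a plain field update.
    static int sumWithAccount(int[] values, CPUAccount cpu) {
        int sum = 0;
        for (int v : values) {
            cpu.usage += 4; // no look-up needed
            sum += v;
        }
        return sum;
    }

    public static void main(String[] args) {
        int[] data = {1, 2, 3, 4, 5};
        CPUAccount explicit = new CPUAccount();

        int a = sumWithLookup(data);
        int b = sumWithAccount(data, explicit);

        System.out.println("results: " + a + " / " + b);
        System.out.println("accounted units: " + currentAccount().usage
                + " / " + explicit.usage);
    }
}

// Hypothetical stand-in for the accounting object: a counter of accounted
// byte-code instructions (the charge of 4 per iteration is arbitrary).
final class CPUAccount {
    long usage;
}
```

Both variants compute the same result and account the same number of units; they differ only in how the counter is reached, which is the overhead that the extra method argument is meant to remove.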
Depending on the structure of the application at byte-code level, the additional accounting code may introduce prohibitive overhead. Integrating optimizations into the rewriting process is therefore necessary. For this purpose, we have studied and implemented simple optimization algorithms that reduce the accounting overhead. The tool allows one or several combinations of optimizations to be specified and plugged into the rewriting process. Furthermore, we have designed the tool so that optimization strategies can easily be changed or extended. Finally, we have evaluated the rewriting tool as well as the overhead in complex Java benchmark applications, showing that our implementation is generic and portable, allows transparent accounting of the CPU resource, and introduces acceptable overhead.
Related Work on Resource Control in Java Environments

JRes [9] is a resource control library for Java, which takes CPU, memory, and network resource consumption into account. Accounting for CPU relies on native code and on the underlying operating system². Memory accounting in JRes also needs the support of a native method (to account for memory occupied by array objects). To achieve accounting of network bandwidth, the authors of JRes also resort to native code, since they replaced the standard java.net package with their own version of it. Consequently, JRes does not meet our requirements regarding portability.

² More precisely, CPU accounting in JRes is based on native threads, a feature not supported by every JVM.

KaffeOS [1] is a Java run-time system that supports the operating system abstraction of a process to isolate applications or mobile agents from each other, as if they were run on their own JVM. With KaffeOS, resource control can be achieved with higher precision than is possible with the portable resource accounting techniques described in this thesis. The KaffeOS approach should by design result in better performance, but it is inherently non-portable. This also means that it does not benefit from the optimizations found in compilers and standard JVMs: the authors report that, in the absence of denial-of-service attacks, IBM's compiler and JVM [19] are 2–5 times faster than theirs. In contrast, our fully portable implementation of resource accounting in Java executes on every standard JVM and incurs only moderate overhead.

NOMADS [22] is a mobile agent system that is able to control the resources used by agents, including protection against denial-of-service attacks. The NOMADS execution environment is based on a Java-compatible VM, the Aroma VM, a copy of which is instantiated for each agent. There is no resource control model or API in NOMADS; resources are managed manually, on a per-agent basis or using a non-hierarchical notion of group. Because it relies on a specialized VM, NOMADS supports only a few hardware and operating system configurations.

There are several lines of research in which environments and analysis tools have been designed that can be exploited with more or less the same objectives as those of this thesis. The Real-Time for Java Experts Group [6] has published a proposal to add real-time extensions to Java. One important focus of this work is to ensure predictable garbage collection characteristics in order to meet real-time guarantees. For instance, the specification provides for several memory management schemes, such as areas with limited lifetime or bounded allocation rates, which could be simulated with the aid of accounted memory resources.

Profilers constitute another class of tools that have many aspects in common with resource control: both intend to gather information about resource usage. Profilers, however, are designed to help developers optimize the efficiency of their applications, not to control their resource consumption externally. The Java Virtual Machine Profiling Interface (JVMPI) [20] is an API created by Sun; it is a set of hooks into the JVM that signal interesting events like thread start and object allocation. Java Usage Monitor (JUM) [11] is a tool that builds upon JVMPI to help the developer determine how much CPU is consumed by the different threads of an application and how much memory they use. JUM needs native code to obtain information from the underlying operating system about how CPU time is allocated, and is therefore not portable.

Finally, we mention some approaches that rely on economics-based theories, using virtual currencies to achieve natural load-balancing of concurrent applications, as well as recycling of unused resources in open distributed environments, with the anticipated side-effect of preventing denial-of-service attacks [25]. Our focus, however, is more on how to provide the basic resource accounting mechanisms on a specific platform, Java, than on the design of high-level (and distributed) resource allocation policies. Nevertheless, the techniques presented here may be exploited for the implementation of open computational markets.
Document organization

In Chapter 1 we explain how information about the CPU resource consumption of an application can be extracted from compiled byte-code. Then, we present some basic definitions that are used to establish the rewriting rules that exploit the extracted information.

Chapter 2 describes the implementation of the rewriting tool as well as the different phases of the byte-code analysis required for the rewriting process. It first presents the Byte-Code Engineering Library (BCEL), since the rewriting tool is based on the BCEL framework for low-level byte-code manipulations. We then show the design of the different components of the tool and give a detailed description of the rewriting process. This is the main part of this thesis because it explains the actual tool implementation.

Chapter 3 is devoted to optimizations that can be applied to reduce the accounting overhead. We explain four optimization algorithms as well as their implementations.

Chapter 4 presents evaluations of rewritten applications as well as evaluations of the different optimization strategies.

Chapter 5 concerns applications of CPU accounting. A simple example shows how resource accounting information can be used. We then describe how the rewriting tool can be integrated into a mobile agent system and discuss other kinds of applications of portable resource accounting.

Finally, Chapter 6 presents the general and personal conclusions and discusses future work on resource accounting and resource control based on the tool that has been implemented. An explanation of how to execute the rewriting tool, with many examples, is provided in the Appendix.
Chapter 1
Extracting CPU resource information

This chapter describes the process that allows the introduction of "accounting code" into Java applications. It also describes the rules that are used to extract CPU resource information from Java byte-code and how the application is instrumented. These rules are applied to all the classes that compose an application before their execution.

We define the rewriting rules for CPU resource instrumentation so as to allow transparent and portable CPU accounting. We discuss the rewriting of both shared classes, i.e. classes that are loaded only once in the JVM by the system class loader, and replicated classes, i.e. classes that can be loaded several times by other class loaders (typically applications loaded by customized class loaders).

A new object (called CPUAccount) is added that represents the accounted resource. This object contains a counter that must be updated when the code is executed. Basically, the modifications are made in the method body in order to make the method aware of its resource consumption, i.e. additional instructions that update the consumption information must be added. These instructions are inserted at strategic locations in the method body in order to (a) perform accounting before the instructions are executed, and (b) reduce the overhead introduced by the accounting itself. The inserted instructions update the counter of the corresponding resource object to reflect the consumption of the CPU resource by the application.

The introduction of the new accounting object also implies changes to the original structure of the application. These changes are made in such a way as to guarantee complete transparency, i.e. the resulting application must behave exactly as if the accounting code were not present at all.

This chapter is organized as follows: Section 1.1 gives an overview of how Java byte-code instructions can be manipulated. It describes how byte-code class files are structured and gives a brief introduction to how byte-code instructions are executed. Section 1.2 presents the basic definitions on which the tool is based and which are required to introduce accounting code. The notion of accounting block in the byte-code is introduced, as well as a description of the accounting object itself. Section 1.3 presents the main rules applied in the rewriting process. Each case is described in detail.
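The idea of updating a counter at strategic locations, before the accounted instructions run, can be pictured with the following source-level sketch of a method before and after rewriting. It is an illustration only: the tool operates on byte-code, the class name BlockAccountingDemo and the field name usage are assumptions, and the per-block instruction counts assume the usual javac translation noted in the comments (the inserted accounting instructions themselves are not counted here).

```java
final class BlockAccountingDemo {

    // Original method, as the programmer wrote it.
    static int abs(int x) {
        if (x < 0) {
            return -x;
        }
        return x;
    }

    // Rewritten counterpart: the account is an extra argument and is
    // updated at the start of each block, before the block executes.
    static int abs(int x, CPUAccount cpu) {
        cpu.usage += 2;      // block 1: iload_0, ifge
        if (x < 0) {
            cpu.usage += 3;  // block 2: iload_0, ineg, ireturn
            return -x;
        }
        cpu.usage += 2;      // block 3: iload_0, ireturn
        return x;
    }

    public static void main(String[] args) {
        CPUAccount cpu = new CPUAccount();
        System.out.println(abs(-7, cpu) + " after " + cpu.usage + " units");
        System.out.println(abs(7, cpu) + " after " + cpu.usage + " units");
    }
}

// Hypothetical stand-in for the CPUAccount object: a single counter.
final class CPUAccount {
    long usage;
}
```

A negative argument follows blocks 1 and 2 and is charged 5 units, a non-negative one follows blocks 1 and 3 and is charged 4 units, while the returned value is the same as for the original method.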
1.1 Byte-code instrumentation
The implementation of resource accounting in Java-based applications is based on byte-code rewriting techniques. There are several reasons for using byte-code rewriting. The first is that this technique guarantees portability, since no native code is used (the resulting code is also Java byte-code). Furthermore, rewriting can be implemented entirely in Java by using a Java-based byte-code engineering framework. The second reason is that it allows resource accounting without resorting to the source code of the application. This is a major advantage, since already compiled applications can be modified. Finally, even though source code modification could also be used to implement the accounting of resource consumption, it would require very complex analysis of the source code to obtain the same result as a solution based on byte-code analysis, since, for example, the relative measurement unit for CPU resource consumption is the number of byte-code instructions.

This section describes in detail how the byte-code instructions are actually manipulated according to the rewriting rules. A short introduction to byte-code engineering is also provided to ease the understanding of the rewriting tool that has been developed for rewriting Java applications and JDK classes, of the insertion of CPU accounting instructions, and of the kind of manipulations that are performed to modify method invocations.
1.1.1 Byte-code engineering
A Java compiler translates the source code of a Java application (a .java text file) into an "executable" program that can be executed (interpreted) by a Java Virtual Machine (JVM). This executable code (called a Java class file) contains hardware-independent code (called byte-code) that can be loaded and executed by any JVM implementation. The Java byte-code instructions are included in a class file (the .class file) that also contains the symbolic information of the class. The Java class file format and the byte-code instructions of the JVM follow a well-defined specification that is described in [16].

Normally, if the developer wants to modify the application, this must be done at Java source code level, i.e. the application must be recompiled in order to produce a new class file. Recently, there has been increasing interest in modifying already compiled applications (avoiding the need to access or modify the source code). The main reason is that application source code is not always available (e.g. for commercial software products, the source code is generally not available to customers), and allowing transformations directly on compiled code can be very useful for adapting the application to particular requirements. This argument is even stronger in the context of mobile code environments, because the code that is dynamically pulled into or pushed to the execution environment is executable code. Thus, any modification of an already executing mobile code application requires some means of stopping, updating, and restoring its execution.

The rationale of byte-code engineering is thus to allow the manipulation of already compiled Java applications and to modify the byte-code directly, producing loadable and executable byte-code without resorting to any external compiler or to the source code. In Java this is possible because .class files contain enough symbolic information to allow direct instrumentation of byte-code. The other benefit of byte-code engineering is that it ensures portability of code, since all modifications are done at byte-code level. Furthermore, byte-code engineering does not require any hardware- or OS-dependent application, i.e. all byte-code manipulations can be done by a Java program. There are several kinds of applications where byte-code engineering can be applied, such as:
• Optimization: Correcting inefficient byte-code generated by a compiler. Such techniques are used, for example, in Just-In-Time (JIT) compilers [19] that are integrated in the JVM, or can be applied off-line. It is also possible to remove unnecessary information (used only for debugging).
• Security: Obfuscation of applications to prevent "black-box testing" attacks (deducing what an application does by running it several times with well-chosen inputs and collecting the results) [2]. Other examples are adding supplementary security checks to an application to verify authorized access [27], and restricting the use of particular methods, packages or classes by removing them from an application [3].
• Adaptation: Modifying the structure of commercial off-the-shelf (COTS) software components without requiring source code, for the visualization and instrumentation of applications [28]; dynamic creation of proxy objects to simplify remote communication between objects [17, 21].
• Strong migration: In the context of mobile agents, byte-code rewriting techniques have been used to allow strong migration (i.e. the migration of the state of a running agent) by inserting, directly at byte-code level, the instructions needed to save the running state of the agent [18, 24].
• Extensions to Java: Compile-time byte-code transformation to extend Java with parameterized classes [5].

To better understand how byte-code instrumentation is performed for resource accounting, the structure of a class file is described next. The description introduces some notions that will be useful for the presentation of the concrete implementation of the rewriting tool in the next section.
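That a class file can be inspected by an ordinary Java program, without native code or a compiler, is easy to check. The following minimal sketch, which is not part of the rewriting tool, uses only the standard library to read the leading fields with which every class file starts according to the JVM specification: the 32-bit magic number 0xCAFEBABE, the minor and major version, and the constant-pool count. The class name ClassFileHeader and its method names are hypothetical.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

final class ClassFileHeader {
    final int magic;
    final int minorVersion;
    final int majorVersion;
    final int constantPoolCount; // number of constant-pool entries plus one

    private ClassFileHeader(int magic, int minor, int major, int cpCount) {
        this.magic = magic;
        this.minorVersion = minor;
        this.majorVersion = major;
        this.constantPoolCount = cpCount;
    }

    // Reads the header fields in the order in which the class file stores them.
    static ClassFileHeader read(InputStream in) throws IOException {
        DataInputStream data = new DataInputStream(in);
        int magic = data.readInt();             // u4 magic
        int minor = data.readUnsignedShort();   // u2 minor_version
        int major = data.readUnsignedShort();   // u2 major_version
        int cpCount = data.readUnsignedShort(); // u2 constant_pool_count
        return new ClassFileHeader(magic, minor, major, cpCount);
    }

    // Locates the class file of an already loaded class as a resource.
    static ClassFileHeader of(Class<?> type) {
        String resource = type.getName().replace('.', '/') + ".class";
        try (InputStream in = ClassLoader.getSystemResourceAsStream(resource)) {
            if (in == null) {
                throw new IOException("class file not found: " + resource);
            }
            return read(in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        ClassFileHeader h = of(Object.class);
        System.out.printf("magic=0x%08X version=%d.%d constant_pool_count=%d%n",
                h.magic, h.majorVersion, h.minorVersion, h.constantPoolCount);
    }
}
```

The version numbers and the constant-pool count printed by main depend on the JDK in use; only the magic number is the same for every class file.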
Structure of a Java class file
A Java class file is structured as follows. It starts with a header that contains a “magic number” (0xCAFEBABE) and the version number, followed by the constant pool, which can be roughly thought of as the text segment of an executable (coded as string constants), the access rights of the class encoded by a bit mask, a list of interfaces implemented by the class, lists containing the fields and methods of the class, and finally the class attributes. Attributes can be used to put additional information into the class file data structure, which can be exploited by a customized class loader. User-defined attributes must, however, be ignored by any virtual machine implementation. All the information needed to dynamically resolve the symbolic references to classes, fields and methods at run-time is found in the constant pool as string constants. As a result, the constant pool forms the largest portion of the class file (about 60 percent), whereas the byte-code instructions themselves account for only about 10 percent.
The byte-code instruction set
The JVM is a stack-oriented interpreter that creates a local stack frame of fixed size for every method invocation. Values may be stored in a frame area containing local variables (also of fixed size), which can be used like a set of registers. The stack frames of the caller and the called method overlap, i.e. the caller pushes arguments onto the operand stack and the called method receives them in local variables. The Java byte-code instructions (detailed in [16]) are grouped as follows:
Constant operations: Constants can be pushed onto the stack either by loading them from the constant pool with the ldc (load constant) instruction or with special “short-cut” instructions where the operand is encoded into the instruction, e.g. iconst_0 or bipush (push byte value).
Arithmetic operations: The operand type is distinguished by using a different instruction for each value type. Arithmetic instructions starting with i, for example, denote integer operations: iadd adds the two integers on top of the stack and pushes the result back onto the stack.
Control flow: There are unconditional branch instructions like goto as well as conditional branch instructions like if_icmpeq (which compares two integers for equality). There are also jsr (jump sub-routine) and ret, which are used to implement the finally clause of try-catch blocks. Exceptions are thrown with the athrow instruction. Branch targets are coded as offsets from the current byte-code position. Finally, there are return instructions that stop the method execution and return a result if one is specified; there is a different return instruction for each JVM type (reference, integer, long, etc.).
Load and store operations: There are instructions to load values from local variables onto the stack and to store values from the stack into local variables, such as iload and istore for integer values.
Field access: The value of an instance field may be retrieved with getfield and written with putfield. There are also getstatic and putstatic for static fields.
Method invocation: Four kinds of invocations are available: invokevirtual for normal method invocation; invokestatic for calling static methods; invokespecial for invocations requiring particular handling, such as constructors, private methods or superclass methods; and finally invokeinterface to invoke a method that is declared in an interface.
Method code
Non-abstract methods (i.e. methods with a body) contain an attribute (Code) that holds the following data: the maximum size of the method’s stack frame, the number of local variables and an array containing the byte-code instructions.
There are also optional values, e.g. the names of local variables or the line numbers in the source code, which can be used by debuggers. The list of byte-code instructions also contains the instructions of exception handlers. Whenever an exception is thrown, the JVM performs exception handling by looking into a table of exception handlers. The table contains information about the scope of each exception handler. Thus, the code of the exception handler responsible for a certain type of exception raised within a given area of the byte-code will be executed. When there is no appropriate handler, the exception is propagated back to the caller of the method. The handler information is itself stored in an attribute contained in the Code attribute. The exception table thus contains from, to, target and type information for each exception handler.
Byte-code offsets
Targets of branch instructions are encoded as relative offsets in the array of byte-codes. Exception handlers and local variables refer to absolute addresses within the byte-code. The former contain references to the start and the end of the try block, and to the first instruction of the handler code. The latter mark the scope of the variable, i.e. the range in which a local variable is valid.
Type information - Signatures
Since Java is a type-safe language, type information is very important to ensure a coherent typing of the applications. The information about the types of fields, local variables, and methods is stored in signatures. Signatures are strings stored in the constant pool and encoded in a special format. For example the method descriptor for the method
Object mymethod(int i, double d, Thread t) is (IDLjava/lang/Thread;)Ljava/lang/Object; which contains the parameter descriptor (between the parentheses) as well as the return descriptor. Classes and arrays are internally represented by strings (e.g. “java/lang/Object”) and basic types such as int and double by an integer number that is replaced in the signature by a single character, e.g. “I” for integer and “D” for double. “Lclassname;” denotes a reference to an instance of the class classname. Other basic types are “B” for byte, “C” for character, “Z” for boolean, etc. For return values, the character “V” indicates that the method returns no value (i.e. it returns void).
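The descriptor encoding described above can be checked with a few lines of Java. The following is only a minimal sketch: the class name DescriptorDemo and the helper descriptor are illustrative names, and the sketch relies on java.lang.invoke.MethodType, which belongs to later Java versions than the one considered in this work.

```java
import java.lang.invoke.MethodType;

final class DescriptorDemo {
    // Returns the JVM method descriptor for the given return and parameter types.
    static String descriptor(Class<?> returnType, Class<?>... parameterTypes) {
        return MethodType.methodType(returnType, parameterTypes).toMethodDescriptorString();
    }

    public static void main(String[] args) {
        // Object mymethod(int i, double d, Thread t)
        System.out.println(descriptor(Object.class, int.class, double.class, Thread.class));
        // void doit(String[] args): arrays are prefixed with '[', void is 'V'
        System.out.println(descriptor(void.class, String[].class));
    }
}
```

For the method mymethod used as example above, the first call yields the descriptor given in the text.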
1.2
Basic accounting definitions
In this section we introduce the basic definitions of “accounting block” and “accounting object” that are used for the rewriting process.
1.2.1
Accounting blocks
Since the relative measurement unit for CPU accounting is the number of executed byte-code instructions, it is necessary to define where the accounting instructions must be inserted. The idea is to define accounting blocks that can be constructed from the information provided by the byte-code instructions themselves (i.e. the construction of the accounting blocks must rely only on static information obtained by analyzing the application code at byte-code level). The accounting block is closely related to the notion of basic block of code. Accounting instructions can be added to each accounting block. In order to reduce the overhead introduced by the inserted accounting instructions, the accounting blocks must be as long as possible. The length of an accounting block represents the number of byte-code instructions that will be executed; thus, it gives an approximation of the CPU consumption when the accounting block is executed. By inserting the accounting instructions at the beginning of the accounting block, it is possible to know the relative CPU consumption before it actually happens. Such information can then be used to perform a fine-grained control of the application. Let us now define the notion of accounting block. An accounting block is a sequence of byte-code instructions that fulfills the following constraints:
• An instruction that changes the control-flow non-sequentially must be the last instruction of an accounting block. Such instructions are, for example, unconditional branches (e.g. goto, jsr, return, ret), conditional branches (if), exception raising (athrow), etc.
• Method and constructor invocations (invokevirtual, invokespecial, invokeinterface or invokestatic) do not terminate an accounting block.
The reason for not treating invocations as block terminators is twofold: doing so would considerably increase the accounting overhead, since such invocations are extremely frequent, and the accounting inside the invoked methods and constructors is performed anyway.
• All branches must point to the beginning of a block. There is no byte-code instruction that branches to an instruction that is not the beginning of its block. This means that when basic blocks are constructed, if a branching instruction is encountered and its target is not the beginning of a block, the block must be split and the target instruction becomes the first
instruction of the new block. The target instructions of exception handlers must also be at the beginning of a block.
An accounting block analysis following the constraints described above partitions the method code into a set of accounting blocks. Each block records its length (the number of byte-code instructions). Using these accounting blocks, a control-flow graph (CFG) is constructed. This graph allows minimizing the accounting overhead by reducing the number of accounting blocks, and thus the total number of accounting instructions to be inserted. The CFG will be used to detect situations where accounting blocks can be combined, which reduces the number of accounting updates inserted in the code (i.e. the blocks must be as large as possible). The accounting instructions are inserted at the beginning of every block (with or without any optimization). Notice that the accounting block size corresponds to a weight of 1 for each byte-code instruction. The approach is however not limited to this kind of measurement, and it is possible to give a different weight to each byte-code instruction depending on a particular policy or on the level of accuracy that is required. This of course requires a finer knowledge of the implementation of the JVM; the system administrator can provide such information, for example in a configuration file read at system start-up.
In Figure 1.1 a fictitious sample shows the construction of the accounting blocks. The sample shows the byte-code instructions of a method, where each instruction is preceded by its offset (this example is only meant to illustrate the creation of accounting blocks; more details on byte-code instrumentation will be provided in the following section). One can see that method invocations do not delimit accounting blocks (the invoke instruction at offset 19 does not induce a block split). Forward branches (e.g.
goto instruction at offset 7 with target at offset 25) induce the creation of block 4, which would otherwise form a single block together with block 3 (see the bold lines between blocks). A similar situation arises with backward branches, where blocks are split as well (e.g. the if branch at offset 32). The exception table contains the range of each exception handler, i.e. its scope, coded by (from-to), and the target instruction of the handler. The target instruction in this case is already the beginning of an accounting block, so there is no need to split the block. The figure also shows the resulting graph, which will be used to insert the accounting instructions. Normally each node in the graph only needs to store a reference to the starting instruction and the length of the associated block.
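The partitioning rules above can be sketched in a few lines of Java. This is a simplified illustration and not the implementation of the rewriting tool: the class AccountingBlocks, the Insn type and the textual listing format are assumptions made for the example, switch instructions are ignored, and only branch instructions carry a target in the listing. The sample reproduces the fictitious method of Figure 1.1, reading the garbled opcode at offset 32 as if_icmplt.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

final class AccountingBlocks {
    /** One byte-code instruction: offset, mnemonic, and branch target (-1 if none). */
    static final class Insn {
        final int offset;
        final String op;
        final int target;
        Insn(int offset, String op, int target) {
            this.offset = offset; this.op = op; this.target = target;
        }
    }

    /** Instructions that must end an accounting block; invokes are deliberately excluded. */
    static boolean endsBlock(String op) {
        return op.startsWith("if") || op.startsWith("goto") || op.startsWith("jsr")
                || op.equals("ret") || op.endsWith("return") || op.equals("athrow");
    }

    /** Returns one {first, size} pair per accounting block, in code order. */
    static List<int[]> partition(List<Insn> code, int... handlerTargets) {
        Set<Integer> leaders = new TreeSet<>();
        if (!code.isEmpty()) leaders.add(code.get(0).offset);     // method entry
        for (int h : handlerTargets) leaders.add(h);              // exception handlers
        for (int i = 0; i < code.size(); i++) {
            Insn in = code.get(i);
            if (in.target >= 0) leaders.add(in.target);           // branch targets
            if (endsBlock(in.op) && i + 1 < code.size()) {
                leaders.add(code.get(i + 1).offset);              // successor of a terminator
            }
        }
        List<int[]> blocks = new ArrayList<>();
        for (Insn in : code) {
            if (leaders.contains(in.offset)) blocks.add(new int[] {in.offset, 0});
            blocks.get(blocks.size() - 1)[1]++;                   // weight 1 per instruction
        }
        return blocks;
    }

    /** The method of Figure 1.1; a third field is given only for branch instructions. */
    static List<Insn> sample() {
        String[] listing = {
            "0 bipush", "2 istore_1", "3 iconst_1", "6 istore_3", "7 goto 25",
            "10 invokestatic", "13 pop", "14 ldc2_w", "17 bipush", "19 invokestatic",
            "22 iinc", "25 iload_3", "26 iconst_1", "27 if_icmple 10", "30 iload_2",
            "31 iconst_1", "32 if_icmplt 3", "35 iinc", "38 goto 44", "41 astore_2",
            "42 iconst_0", "43 istore_3", "44 return"
        };
        List<Insn> code = new ArrayList<>();
        for (String line : listing) {
            String[] p = line.split(" ");
            int target = p.length > 2 ? Integer.parseInt(p[2]) : -1;
            code.add(new Insn(Integer.parseInt(p[0]), p[1], target));
        }
        return code;
    }

    public static void main(String[] args) {
        for (int[] b : partition(sample(), 41)) {                 // 41 = handler target
            System.out.println("{first=" + b[0] + ", size=" + b[1] + "}");
        }
    }
}
```

Applied to the sample, the sketch yields eight blocks whose first offsets and sizes agree with those annotated in Figure 1.1. A per-instruction weight other than 1 could be introduced by replacing the increment with a table look-up.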
1.2.2
Accounting object
Once sufficient information for the insertion of accounting code is known, we have to define how the actual accounting is performed. For this, we define an accounting object that is used to store the consumption information and that serves as the basis for the implementation of the control algorithm. The accounting object has a counter field (usage) that must be updated when the application code is executed. The CPUAccount class is depicted in Table 1.1. The accounting is updated by incrementing the usage field of the CPUAccount object. In other words, the updating instructions that must be inserted at the beginning of each accounting block correspond to (a) loading the usage field of the CPUAccount object from memory (it is declared as volatile), (b) updating the current value by adding the block size and (c) storing the new value in memory. This operation corresponds to the following statement (in source code): “usage += size(block) + size(update);”. The corresponding byte-code instructions are depicted in Figure 1.2 (notice the post-fixed notation of the byte-code instructions). In this example, we suppose that the block size is 4, so the value actually added will be 10 (since the update itself uses 6 byte-code instructions). We also suppose that the CPUAccount object is associated with local
Method void doit(java.lang.String[])
   0 bipush 10          block 1 {first=0, size=2}
   2 istore_1
   3 iconst_1           block 2 {first=3, size=3}
   6 istore_3
   7 goto 25
  10 invokestatic #2    block 3 {first=10, size=6}
  13 pop
  14 ldc2_w #3
  17 bipush 100
  19 invokestatic #5
  22 iinc 3 1
  25 iload_3            block 4 {first=25, size=3}
  26 iconst_1
  27 if_icmple 10
  30 iload_2            block 5 {first=30, size=3}
  31 iconst_1
  32 if_icmplt 3
  35 iinc 2 -1          block 6 {first=35, size=2}
  38 goto 44
  41 astore_2           block 7 {first=41, size=3}
  42 iconst_0
  43 istore_3
  44 return             block 8 {first=44, size=1}
Exception table:
  from   to   target   type
     3   38       41
CFG: [graph over blocks 1 to 8; the edge entering block 7 is labelled “from exception”]
Figure 1.1: Sample of an accounting block analysis

public final class CPUAccount {
    public volatile int usage; // stores CPU relative consumption

    // returns the CPUAccount associated to the current thread
    public static CPUAccount getOrCreate() { ... }
    ...
}
Table 1.1: The CPUAccount implementation.

variable number 1. Thus, it is necessary to push the reference to the accounting object onto the stack in order to proceed to the actual update. The update process is as follows. At offset 4, a reference to the accounting object is pushed onto the stack (aload); this reference is then duplicated on the stack at offset 5 (dup). The duplicated reference is used to retrieve the usage field, since the getfield instruction consumes the reference on top of the stack. Thus, after the execution of this instruction at offset 6, the stack contains the original reference to the accounting object and, on top of it, the value of the usage field. The bipush instruction at offset 9 pushes the amount to be added onto the stack, and the iadd instruction at offset 11 adds the two integer values on top of the stack, consuming them and leaving the result on top of the stack. After the execution of these instructions, only two values remain on the stack: the original accounting object reference and the result of the addition. Finally, the putfield instruction at offset 12 updates the usage field of the accounting object and consumes these two values. This shows that the updating instructions do not affect any value on the stack of the current execution outside the updating process itself, and that they can therefore be inserted at the locations determined by the accounting block analysis.
   4 aload 1                          // load local variable 1 in stack (the CPUAccount object)
   5 dup                              // duplicates the value on stack
   6 getfield CPUAccount::int usage   // load usage field value
   9 bipush 10                        // put the amount to add
  11 iadd                             // add values on stack
  12 putfield CPUAccount::int usage   // store the updated value
Figure 1.2: Updating usage field in byte-code.
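The claim that the update sequence leaves the operand stack untouched can be illustrated by interpreting the six instructions by hand on an explicit stack. The sketch below is only a toy model: the class UpdateSequenceDemo, its nested Account class and the method runUpdate are names introduced for this example and do not belong to the rewriting tool.

```java
import java.util.ArrayDeque;
import java.util.Deque;

final class UpdateSequenceDemo {
    /** Minimal stand-in for the accounting object of Table 1.1. */
    static final class Account {
        volatile int usage;
    }

    /** Executes the update sequence of Figure 1.2 on an explicit operand stack. */
    static void runUpdate(Deque<Object> stack, Object[] locals, int amount) {
        stack.push(locals[1]);                          // aload 1
        stack.push(stack.peek());                       // dup
        stack.push(((Account) stack.pop()).usage);      // getfield usage (consumes one reference)
        stack.push(amount);                             // bipush amount
        int b = (Integer) stack.pop();
        int a = (Integer) stack.pop();
        stack.push(a + b);                              // iadd
        int value = (Integer) stack.pop();
        ((Account) stack.pop()).usage = value;          // putfield usage (consumes value and reference)
    }

    public static void main(String[] args) {
        Account account = new Account();
        Object[] locals = {null, account};              // local variable 1 holds the account
        Deque<Object> stack = new ArrayDeque<>();
        stack.push("value of the surrounding code");    // must survive the update
        runUpdate(stack, locals, 10);                   // block size 4 + update size 6
        System.out.println(account.usage + " " + stack.size());
    }
}
```

The value pushed before the update is still the only element on the stack afterwards, which is the property that allows inserting the sequence at the beginning of any accounting block.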
1.3
Rewriting rules
This section explains the rules that will be applied to insert the accounting code. This explanation will focus on method and invocation modifications, on special cases such as native and abstract methods as well as the static initializers and constructors for both JDK and normal classes. An algorithm is also proposed to solve the problems concerning the native methods.
1.3.1
Method modifications
The update of the CPUAccount object requires that the accounting object be accessible inside the accounted method, i.e. it must be a local variable of the method or must be obtained from another object. For the former case, the basic idea is to rewrite methods and constructors in order to pass the CPUAccount object as an extra argument. Consequently, invocations of the modified methods must also be rewritten inside the method bodies of the caller classes. The accounting object is passed as the last argument, so that it can be pushed onto the stack immediately before the rewritten method/constructor invocation (i.e. this eases the rewriting of method invocations with the additional accounting object as argument). The accounting object then becomes a local variable of the method and can be used for the update. For the cases where the accounting object is not passed as an additional argument, the CPUAccount class provides a method that allows obtaining the accounting object directly (see the static method getOrCreate in Table 1.1). This method returns the CPUAccount object whose usage value must be updated, i.e. it returns the accounting object associated with the currently running thread, or creates a new one and associates it with the current thread if no accounting object exists yet.
Avoiding rewriting of method invocations
Even if the actual rewriting of methods is based on passing an additional accounting object, let us also consider a solution fully based on the getOrCreate method, which consists in adding accounting code to method/constructor bodies without modifying either the signatures or the invocations of methods/constructors inside the bodies. The method body is rewritten to first retrieve the CPUAccount object (using getOrCreate), and then to perform accounting using the usage value.
In this approach the accounting block analysis is exactly the same as described before, and a sample is shown in Table 1.2¹. One can see that the original instructions of the method are not modified; only the accounting update instructions are inserted in between (based on the accounting block analysis). This approach has the advantage that no invocation in the method body has to be rewritten, since the method signature stays compatible with the original invocation from any caller that is not aware of
¹ For simplicity, the transformations are shown at Java source level, but the actual transformations are performed at byte-code level. Details of byte-code transformations will be given in the following section.
// original
void aMethod() {
    ... // the method body
    ... // instructions
    ...
}

// accounted
void aMethod() {
    // get the CPUAccount object
    CPUAccount cpu = CPUAccount.getOrCreate();
    // update a basic account block
    cpu.usage += 12; // size(block_x) + size(update)
    ... // original method body
    ... // instructions of block_x
    // update for another account block
    cpu.usage += 16; // size(block_y) + size(update)
    ... // instructions of block_y
}
Table 1.2: CPU accounting in a method.

the accounting object (such as native method call-backs, which are discussed in the following), and of course the whole rewritten application is externally seen as “unchanged”. Since this approach does not modify any method invocation in the method body, every rewritten method must call getOrCreate as its first instruction to retrieve the accounting object, which simplifies the rewriting process. However, its major drawback is the overhead introduced by the static method getOrCreate in CPUAccount, since this method implements the look-up of the accounting object associated with a thread (a costly operation). More precisely, the association between the accounting object and a thread is made through a thread-local variable (an instance of java.lang.ThreadLocal). Thread-local variables are only visible to the executing thread, so when a thread executes the code, it is the thread itself that accesses its accounting object. The problem is that accessing the accounting object through a thread-local variable introduces a significant performance penalty, since the implementation of thread-local variables is based on hash-maps in Sun JDK 1.3, i.e. each access to the thread-local variable requires a hash-map look-up. The other drawback of this approach is that the accounting object is retrieved anew in every method, thus losing the reference to the already retrieved accounting object (which could instead be passed as an additional argument). Thus, the approach fully based on getOrCreate is not retained, but we will see that in some cases it is necessary to avoid rewriting some invocations in the method body, and getOrCreate must then be used.
Duplication of method body
Another solution consists in adding a new resource-aware version of each method, i.e. having two implementations of each method.
The original method is rewritten in order to retrieve the accounting object (using the getOrCreate) and then perform accounting in the method body. All the method invocations inside the body are rewritten to pass the additional accounting argument already available. A new resource aware version is added to all target classes (since we have modified the original method invocations). This new version contains a copy of the original body that has also been instrumented for accounting, exactly as the original method (see Table 1.3). Even if the overhead introduced by the look-up of the current CPUAccount object has been reduced with this approach, one can see that the size of the class is doubled because all method bodies are copied to construct the new method with the accounting object.
// original
public class A {
    void m() {
        B b = new B();
        b.mm();
    }
}
public class B {
    void mm() {
        ...
        ...
    }
}

// accounted
public class A {
    void m() { // original signature
        CPUAccount cpu = CPUAccount.getOrCreate();
        cpu.usage += 10; // accounting
        B b = new B();
        b.mm(cpu); // rewritten invocations
    }
    void m(CPUAccount cpu) { // resource aware
        cpu.usage += 10; // same rewritten body
        B b = new B();
        b.mm(cpu);
    }
}
public class B {
    void mm() {
        CPUAccount cpu = CPUAccount.getOrCreate();
        cpu.usage += 14;
        ...
        cpu.usage += 10;
        ...
    }
    void mm(CPUAccount cpu) {
        cpu.usage += 14;
        ...
        cpu.usage += 10;
        ...
    }
}
Table 1.3: Rewriting two versions for each method

Redirection using method wrappers.
To solve the size problem of the preceding approach, the body of the original method is replaced by a redirection to the new method, which takes the additional accounting object and contains the original method body extended with the accounting updates. In other words, we use method wrappers: the new method with the additional accounting object is wrapped by the original method (see Table 1.4). The method with the original signature no longer contains the original body, which is now fully implemented in the method taking the additional accounting object. This approach provides an interesting solution that can be applied to almost all kinds of methods, and it is the basis of the actual implementation of the rewriting. Some small adaptations are however required in some cases. One case that requires particular attention concerns native methods, i.e. methods that are not implemented in Java, but in hardware and Operating System (OS) dependent code. The difficulty is that accounting inside such implementations would require modifications that cannot be made in a portable way and that would have to be made at source code level. We describe in the following how to cope with the problems raised by native methods.
// original
public class A {
    void m() {
        B b = new B();
        b.mm();
    }
}
public class B {
    void mm() {
        ...
        ...
    }
}

// accounted
public class A {
    void m() { // becomes a wrapper
        m(CPUAccount.getOrCreate());
    }
    void m(CPUAccount cpu) { // resource aware
        cpu.usage += 10;
        B b = new B(); // original accounted body
        b.mm(cpu);
    }
}
public class B {
    void mm() {
        mm(CPUAccount.getOrCreate());
    }
    void mm(CPUAccount cpu) {
        cpu.usage += 14;
        ... // original accounted body
        cpu.usage += 10;
        ...
    }
}
Table 1.4: Solving size overhead using wrappers.
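The wrapper scheme of Table 1.4 can be made runnable as follows. This is a sketch under assumptions: the outer class WrapperDemo and the helper usageOnFreshThread are illustrative, the elided method bodies of Table 1.4 are reduced to their accounting updates, and getOrCreate is realized with ThreadLocal.withInitial, which is available only in Java versions later than the JDK 1.3 discussed above.

```java
final class WrapperDemo {
    /** Accounting object with a thread-local look-up, in the spirit of Table 1.1. */
    static final class CPUAccount {
        volatile int usage;
        private static final ThreadLocal<CPUAccount> CURRENT =
                ThreadLocal.withInitial(CPUAccount::new);

        /** Returns the account of the current thread, creating it on first use. */
        static CPUAccount getOrCreate() {
            return CURRENT.get();
        }
    }

    static final class B {
        void mm() {                        // wrapper with the original signature
            mm(CPUAccount.getOrCreate());
        }
        void mm(CPUAccount cpu) {          // resource-aware version
            cpu.usage += 14;               // first accounting block
            cpu.usage += 10;               // second accounting block
        }
    }

    static final class A {
        void m() {                         // wrapper: one look-up, then delegation
            m(CPUAccount.getOrCreate());
        }
        void m(CPUAccount cpu) {           // resource-aware version
            cpu.usage += 10;
            B b = new B();
            b.mm(cpu);                     // rewritten invocation, no further look-up
        }
    }

    /** Runs A.m() on a new thread and returns the usage recorded for that thread. */
    static int usageOnFreshThread() {
        int[] result = new int[1];
        Thread t = new Thread(() -> {
            new A().m();
            result[0] = CPUAccount.getOrCreate().usage;
        });
        t.start();
        try {
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return result[0];
    }

    public static void main(String[] args) {
        System.out.println(usageOnFreshThread());
    }
}
```

Only the outermost wrapper pays for the thread-local look-up; all nested invocations receive the accounting object as an argument. Each thread accumulates its consumption in its own account.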
1.3.2
Native Methods and JDK rewriting.
Java gives the possibility to implement some methods in native code, i.e. in an architecture and Operating System (OS) dependent programming language (typically C). The Java Native Interface (JNI) allows implementing native methods that can be called by Java programs (as normal methods). The native method implementation can also manipulate Java objects (directly from the C program). In general the implementation of native methods is not considered good practice, since the portability of code is no longer guaranteed. This argument is also important in the context of mobile code environments, where native code implementations must in general be avoided (portability is one of the major arguments for the success of Java as an implementation language for mobile agent frameworks). On the other hand, native methods allow increasing the performance of applications, since native code runs much faster than interpreted Java code, and of course they give access to some OS primitives that cannot be implemented in Java. However, as stated before, the use of native methods must be reduced as much as possible (as an example, only 3 percent of the methods in JDK implementations are native). Native methods introduce some difficulties for rewriting, since their implementations are not aware of the additional accounting object that is passed as parameter, and of course it is not possible to modify native libraries to add accounting information. One problem arises in the case of callbacks from native methods, i.e. when the method implementation in C calls a Java method without an accounting object. Three scenarios are of particular interest:
• Thread creation: The Java runtime system (implemented in native code) calls the run method of a thread object when the thread is started with the start method.
• Static initializers: Static initializers that are used to initialize class variables of a class are directly called during the class-loading process and are invoked by native code (static initializers are similar to object constructors, but for classes).
• Reflection: Native code is used for object creation through reflection, via the newInstance method in java.lang.reflect.Constructor and java.lang.Class, or for explicit method invocation via the invoke method in java.lang.reflect.Method.
To solve the problem of native method callbacks, the wrappering approach described before (see Section 3) can be applied, since the native code will call the wrapper method, which has the signature of the original method. The difficulty is to know which native methods perform such callbacks. So, a simple solution is to apply the wrappering approach to all the rewritten classes. Another difficulty comes with the rewriting of native methods themselves. Since native methods are implemented in C, they have no method body in Java byte-code, so it is necessary to deal with invocations that call native methods (in order to avoid rewriting them with the additional accounting object). It is then necessary to know whether the target of a method invocation is a native method. Unfortunately, this information is not available in the caller class, since the caller is not aware of the nature of the called method (native methods are invoked as normal methods). Thus, two different solutions can be applied:
(a) Wrappering all native methods
A first solution consists in creating a wrapper method for each native method. The signature of the native method is reproduced to create a wrapper method with the accounting object as additional argument. In this new wrapper method the native modifier is removed, and its body simply invokes the original native method in the same class and drops the accounting object (see Table 1.5). This ensures that all callbacks from native methods remain valid, and that all the invocations that were rewritten in other methods (with the accounting object) will also be able to call the native method.

// original
native void m(args);

// accounted
native void m(args);
void m(args, CPUAccount cpu) { // drops accounting
    m(args); // invoke native
}
Table 1.5: Wrappering native methods

This first approach to the rewriting of native calls has, however, a major restriction: it is based on the assumption that the native code does not use any information concerning the caller of the method. Unfortunately, this assumption is too strong. Some native method implementations in the JDK use information about the caller to perform access checks internally, so any modification of the call sequence will cause an access check error (this problem can be found in JVM internal implementations). It is even possible that some methods use information about the caller of the caller, which is no longer consistent after the kind of transformations described here. The reason is that the wrappering of the method invocation changes the actual caller of the native method. This can be seen in Table 1.5, where the wrapper in the class itself becomes the caller of the native method, rather than the actual caller. To solve this problem, a structural analysis of the rewritten application is necessary in order to know which methods must be excluded from wrappering and, consequently, which invocations must not be rewritten. Notice however that the problem of caller information in native methods mostly concerns JDK rewriting; in general, applications do not depend on low-level JVM internals.
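The change of caller introduced by a wrapper can be observed with a small experiment. The sketch below does not use real native code: the method sensitive is a hypothetical stand-in for a caller-sensitive native method and inspects the stack trace to find its immediate caller, and all names in CallerDemo are introduced for the example.

```java
final class CallerDemo {
    /** Stand-in for a caller-sensitive native method: reports the name of its immediate caller. */
    static String sensitive() {
        StackTraceElement[] trace = new Throwable().getStackTrace();
        return trace[1].getMethodName();   // trace[0] is sensitive() itself
    }

    /** Wrapper in the style of Table 1.5: takes the extra argument, drops it and delegates. */
    static String sensitive(Object droppedAccount) {
        return sensitive();
    }

    /** Unmodified client: calls the method directly. */
    static String clientDirect() {
        return sensitive();
    }

    /** Rewritten client: the invocation now goes through the wrapper. */
    static String clientViaWrapper() {
        return sensitive(new Object());
    }

    public static void main(String[] args) {
        System.out.println(clientDirect());
        System.out.println(clientViaWrapper());
    }
}
```

In the direct case the method sees the client as its caller, whereas after rewriting it sees the wrapper. A native method that bases an access check on its caller would therefore take its decision for the wrong frame.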
(b) Using structural analysis
The second approach to the native code invocation problem is based on an analysis of the structure of all classes before they are rewritten. The basic idea is to determine which method invocations must not be rewritten, i.e. which methods do not require a wrapper. In the simplest case, this means that for each method invocation encountered in a method body, we have to know whether the target method is native or not, as can be seen in Table 1.6. It is then sufficient to analyze all the target classes to check whether the invocation refers to a native method. Such an analysis can be implemented as a two-phase algorithm that analyzes all classes used in the application in order to create a list of methods whose invocations must not be rewritten. The main disadvantage of this approach is that the analysis must cover all the application code and cannot be done using local class information only.
// original
public class A {
    void m() {
        B b = new B();
        b.mm();
    }
}
public class B {
    native void mm();
}

// accounted
public class A {
    void m() {
        m(CPUAccount.getOrCreate());
    }
    void m(CPUAccount cpu) {
        cpu.usage += 10;
        B b = new B();
        b.mm(); // call the original native
    }
}
public class B {
    native void mm(); // unmodified
}
Table 1.6: Second approach for native methods: Using structure analysis

In fact, structural analysis can be used to solve other problems related to inheritance and to the modifications of the application structure, for example those caused by the introduction of the new resource-aware methods (see the discussion about abstract methods in the following). The goal is to know which methods must be excluded from wrappering, i.e. which methods must be called directly without any kind of delegation.
Abstract Methods
Abstract methods, i.e. methods that are implemented in subclasses, do not have a method body. This implies that if a new method is added (e.g. in the case of wrappering), this method will also be abstract, and thus all subclasses must also implement the added method (see Table 1.7). A similar rewriting scheme can be applied to interface methods (which have no body either); their implementation must be provided by the classes that implement the interface (all methods in interfaces are abstract). A structural analysis allows solving the problem of abstract and native methods. This analysis can, for example, be applied to the whole JDK. The following two-pass algorithm creates a list of methods that must not be called through a wrapper.
// original
abstract public class A {
    abstract void m();
}

public class B extends A {
    void m() { ... }
}

public class C extends B {
    void m() {
        ...
        super.m();
    }
}

// accounted
abstract public class A {
    abstract void m(); // must be implemented in subclasses
    abstract void m(CPUAccount cpu);
}

public class B extends A {
    void m() {
        m(CPUAccount.getOrCreate());
    }

    void m(CPUAccount cpu) {
        cpu.usage += 12;
        ...
    }
}

public class C extends B {
    void m() {
        m(CPUAccount.getOrCreate());
    }

    void m(CPUAccount cpu) {
        cpu.usage += 18;
        ...
        super.m(cpu); // invokes the rewritten method
    }
}
Table 1.7: Rewriting of abstract methods

Algorithm to avoid native method wrapping in JDK

Definitions:
- For each class X, let super(X) be the set of (direct and indirect) super classes of X, i.e. super(X) = {Y | X extends+ Y}, where "extends+" is the transitive closure of the extension relation.
- For each class X, let interf(X) be the set of interfaces (directly or indirectly) implemented by X, i.e. interf(X) = {Y | X implements+ Y}, where "implements+" is the transitive closure of the implementation relation.
- For each class X, let base(X) be the union of super(X) and interf(X).
- For each class/interface X, let methods(X) denote the methods defined/declared in X.
- For each class X, let natives(X) denote the native methods defined in X.
- For each class/interface X, let abstracts(X) denote the abstract methods declared in X.
The first pass: Scan the whole JDK and other libraries that may use native code, then build a complete graph including the following information: inheritance, implemented interfaces, methods, and special tags for abstract and native methods. Each method has a tag hasWrapper indicating whether it receives a wrapper or not.

Initialization:
for each class X
    for each method M in methods(X)
        M.hasWrapper = true

Marking algorithm:

for each class X
    for each method N in natives(X)
        N.hasWrapper = false
        for each class/interface Y in base(X)
            if Y defines/declares a method M with the same signature as N
                M.hasWrapper = false
The first pass creates the list of methods that do not get a wrapper. The marked methods are of three kinds: native methods (marked by definition), abstract methods (without any implementation), and non-abstract methods (normal methods with an implementation). In other words, the list contains every native method together with every method that is redefined as native in some subclass of the class or interface where it is defined or declared.
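The marking step can be sketched in plain Java as follows. This is only an illustration under simplifying assumptions: ClassInfo and WrapperMarker are hypothetical names that do not come from the rewriting tool, a method is identified by a string made of its name and descriptor, and the hasWrapper tag is represented implicitly by membership in the returned set (a method in the set has hasWrapper = false).

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical, simplified model of a class or interface seen during the first pass. */
final class ClassInfo {
    final String name;
    final List<ClassInfo> directBases = new ArrayList<>(); // super class and implemented interfaces
    final Set<String> methods = new HashSet<>();            // name + descriptor, e.g. "mm()V"
    final Set<String> natives = new HashSet<>();            // native methods defined in this class

    ClassInfo(String name) {
        this.name = name;
    }

    /** base(X): all direct and indirect super classes and interfaces. */
    Set<ClassInfo> base() {
        Set<ClassInfo> result = new HashSet<>();
        collect(this, result);
        return result;
    }

    private static void collect(ClassInfo c, Set<ClassInfo> acc) {
        for (ClassInfo b : c.directBases) {
            if (acc.add(b)) {
                collect(b, acc);
            }
        }
    }
}

final class WrapperMarker {
    /** Returns the "Class.method" keys of all methods that must not get a wrapper. */
    static Set<String> methodsWithoutWrapper(List<ClassInfo> classes) {
        Set<String> noWrapper = new HashSet<>();
        for (ClassInfo x : classes) {
            for (String n : x.natives) {
                noWrapper.add(x.name + "." + n);          // the native method itself
                for (ClassInfo y : x.base()) {
                    if (y.methods.contains(n)) {          // same signature in a base class/interface
                        noWrapper.add(y.name + "." + n);
                    }
                }
            }
        }
        return noWrapper;
    }
}
```

Every method that is not in the returned set keeps its initial tag hasWrapper = true, which corresponds to the initialization loop of the pseudocode.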
The second pass: The list generated by the first pass is used as follows. For each method in the list, no method with the additional accounting argument is added, i.e. no wrapper is needed. By definition, native methods are not rewritten and no wrapper is created for them. For abstract methods in the list, no prototype declaration of the additional method is added: a new abstract method would require a concrete implementation in all subclasses, and since one subclass redefines the method as native, such a method must not be added. For non-abstract methods implemented in Java, the following code must be generated:

m(args) {
    CPUAccount cpu = CPUAccount.getOrCreate();
    ... method code with accounting instructions
}

These methods do not redefine a native method, but they are themselves redefined by a native method in some subclass. Thus, they must not get a wrapper and instead access the accounting object through the getOrCreate method (as in the solution depicted in Table 1.2). Finally, for each method invocation, it is necessary to check whether the target method is in the list. If it is, no additional argument must be added.
To summarize, this algorithm ensures that:
• native methods do not get wrappers;
• polymorphic call sites correctly invoke native methods, since a definition/declaration in the super class/interface must not offer a wrapper that is not available in the subclass.
All subclasses of the class C containing the last native redefinition of method m in the hierarchy may have wrappers. However, no subclass of the class C may redefine m as native.
Other particular cases for JDK rewriting

In the case of JDK rewriting, only a few particular methods do not receive any "accounting code". This is the case for the constructors of the Object class, for methods belonging to classes related to the accounting, and for some static initializers.

Constructor of the Object class: As described above (in Section 1.3.1), constructors also receive "accounting code", but this is not the case for the java.lang.Object class. This class is the root of the whole JDK class hierarchy. Every class must have at least one constructor (at byte-code level), and each constructor must start with the invocation of a constructor of its super class. The exception is Object, which has no super class; the body of its constructors must therefore be empty.

Static Initializers: Since the accounting classes (e.g. CPUAccount, AccountReference) are implemented using JDK classes, rewriting the static initializers of JDK classes can in some cases create a circular dependency, resulting in an initialization error. This problem occurs, for example, while initializing a JDK class that contains a reference to an accounting object before the accounting class has been initialized by the VM (e.g. class Thread). In this particular case, the rewriting of static initializers must be avoided.

Abstract Method Error: Consider a class C, the set super(C) of (direct and indirect) super classes of C, and the set interf(C) of interfaces (directly or indirectly) implemented by C. Let S ∈ super(C) and I ∈ interf(C). It is then possible to use an INVOKEINTERFACE instruction to call a method (e.g. void m()) that is implemented in S and declared in I. An error of type "AbstractMethodError" can occur if S implements the method as native (native void m()), since the JDK rewriting algorithm (see Section 1.3.2) does not extend the signature of native methods with the new argument, but does so for the abstract methods of interfaces (e.g. void m(CPUAccount cpu)). The abstract method that is called is then the one with the new signature, which has no implementation: no "wrapper" for it exists in C. To avoid this situation, new "wrapper" methods are added in a subclass of S for the native methods of S. These "wrappers" invoke the super method with the same signature as the native method (e.g. super.m()). When the class is rewritten, the new method is taken into account and the rewriting algorithm creates a "wrapper" for it, because this method is not native. Thus, for calls of type INVOKEINTERFACE as described above, there will always be an implementation in a super class.

Constructors

Constructors are rewritten like normal methods, i.e. a new constructor is added that takes the additional accounting object. In Java, a constructor invokes a constructor of its super class (there is always an invocation of super()). If a constructor contains no instructions, the call to the super class constructor is added automatically by the compiler. The invocation of super() must also be the first instruction in a constructor body. Thus, when rewriting constructors, the accounting code must be added after the invocation of super(), as can be seen in Table 1.8. The original constructor body is replaced by a redirection to a newly added constructor that contains the original constructor body together with the accounting code, i.e. constructors are wrapped. The original constructor thus returns the object initialized by the new constructor with the additional accounting object. This is done by invoking the new constructor through a reference to this. For JDK rewriting, special attention must be paid to the constructor of the java.lang.Object class, since this class is the root of the class hierarchy and has no super class (i.e. its constructor has an empty body).
// original
class B extends C {
    B() {
        super();
        ...
        ...
    }
}

// accounted
class B extends C {
    B() {
        this(CPUAccount.getOrCreate());
    }

    B(CPUAccount cpu) {
        // cpu.usage += size(block) not allowed before super()
        super(cpu);
        cpu.usage += 10;
        ...
        ...
    }
}
Table 1.8: Rewriting constructors
1.4 Summary
This section summarizes the instruments, definitions and algorithms that contribute to the extraction of CPU resource information from Java applications. These instruments make it possible to design a tool that rewrites applications; its implementation must conform to the rules and definitions given in this chapter.
1.4.1 Synopsis of rewriting rules
The rewriting process can be summarized as follows:
• All methods with an implementation (i.e. a body) must be rewritten. Two new methods are created and the original method is replaced. The first created method is a wrapper with the original method signature, and the second (the wrapped method) contains the modified method body with the accounting code.
• The two generated methods must have the same exceptions, access flags, return type and name. The wrapped method extends the signature with an argument of type CPUAccount, and the wrapper keeps the original signature in order to replace the original method.
• The original instruction list is transferred to the wrapped method. These instructions are used to construct the control flow graph (CFG) of the method. Each node in the CFG represents an accounting block.
• Accounting blocks determine where the accounting instructions are inserted. Accounting instructions update the "usage" field of the CPUAccount object. These instructions must be inserted before the first instruction of each accounting block.
• Method invocations are rewritten to call the wrapped methods (i.e. those with the additional accounting argument). This implies that an instruction loading the reference to the accounting object must be added before the new invocation.
• The wrapper methods use the CPUAccount.getOrCreate() method to obtain the current "accounting object"; the argument references are then pushed on the stack, followed by the "accounting object" reference. These are used to invoke the wrapped method (which receives the original arguments as well as the accounting object). The correct invocation instruction is chosen according to the "access flags" of the target method.
• For methods without a body (e.g. abstract or interface methods), two new methods without a body are created. The "access flags" and exceptions are copied without modification.
• For the special case of native methods (which have a non-bytecode implementation), a search algorithm is used to find all native methods, as well as all methods that the native methods override. This algorithm explores the whole class hierarchy. To keep the class dependencies coherent, the methods found by the algorithm are not given a wrapper; those that have a method body are still instrumented and obtain the accounting object through the CPUAccount.getOrCreate() method.
• Constructors are rewritten like normal methods (i.e. with accounting code and a wrapper method). However, the constructors of java.lang.Object must not be rewritten. Special attention must be paid to the first accounting block of a constructor, which must invoke the super() constructor before any other instruction.
• Static initializers are rewritten using the CPUAccount.getOrCreate() method, except for java.lang.Object and java.lang.Thread.
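At source level, the effect of these rules on an ordinary method can be pictured with the following minimal sketch. It is not the tool's output: the CPUAccount class shown here is a hypothetical stand-in that only has the usage field and the getOrCreate() method named in the text, the use of a ThreadLocal to hold the current account is an assumption of this sketch, and the block costs (4 and 6) are illustrative values rather than real byte-code counts. Summer is a hypothetical example class.

```java
/** Hypothetical stand-in for the accounting class; only usage and getOrCreate() come from the text. */
final class CPUAccount {
    // One account per thread. Using ThreadLocal is an assumption of this sketch.
    private static final ThreadLocal<CPUAccount> CURRENT =
            ThreadLocal.withInitial(CPUAccount::new);

    volatile int usage; // consumption counter updated by the accounting instructions

    static CPUAccount getOrCreate() {
        return CURRENT.get();
    }
}

/** Source-level view of a class after rewriting. */
class Summer {
    // Wrapper: keeps the original signature and redirects to the wrapped method.
    int sum(int n) {
        return sum(n, CPUAccount.getOrCreate());
    }

    // Wrapped method: original body plus accounting code, with the extra CPUAccount argument.
    int sum(int n, CPUAccount cpu) {
        cpu.usage += 4;         // accounting for the entry block (illustrative cost)
        int total = 0;
        for (int i = 1; i <= n; i++) {
            cpu.usage += 6;     // accounting for the loop body block (illustrative cost)
            total += i;
        }
        return total;
    }
}
```

Callers that have not been rewritten keep calling sum(int) and pay for one getOrCreate() lookup; rewritten callers pass their own cpu reference directly to sum(int, CPUAccount).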
Chapter 2
Implementation of the rewriting tool

In Chapter 1 we described in detail the modifications needed to introduce the accounting code into any Java class at byte-code level. In this chapter we describe how these modifications are actually performed. To do so, we need to be able to access and modify the following information in classes:
• add/replace methods (to add wrapper methods)
• modify method signatures (to add the accounting argument)
• manipulate instructions in the method body (to add the accounting instructions and modify invocations)
• update instruction offsets (to update the branches after the insertion of accounting instructions)
• modify the constant pool (to update all the new information related to the accounting)
• add local variables and set their scope
• add new fields in classes
• update the scope of exception handlers (from the exception table)
All these modifications require complex manipulations of the class file. For example, when new instructions are added to the body of a method, several references must be updated. Adding new references to methods or to new classes also requires modifying the constant pool. When a new version of a method is created with an additional accounting object, the signature of the method must be updated in the constant pool. Modifying a method invocation so that it calls the method with the additional accounting object requires adding the instruction that pushes this object on the stack before the modified invocation. This in turn requires knowing the number of actual arguments of the method as well as their types (since, depending on the type, a different number of slots is used in the local variables). The insertion of accounting instructions also modifies the scope of local variables and exception handlers, so the exception table must be updated as well.
Furthermore, the creation of accounting blocks requires analyzing method bodies and obtaining structural information about classes and methods. There are several low-level byte-code engineering frameworks written in Java (e.g., BCA [14], JOIE [8], BIT [15]), as well as higher-level frameworks such as Javassist [7]. We have chosen BCEL (Byte Code Engineering Library, formerly called JavaClass) [10] for the implementation of a byte-code rewriting tool for the extraction of resource information. BCEL allows fine-grained manipulation of Java byte-code and is entirely written in Java. Our choice is justified since BCEL is one of the most mature byte-code instrumentation frameworks and provides a powerful and intuitive API that is well adapted to our requirements. It allows arbitrary modification of class files, and even the construction of new classes from scratch at byte-code level.

This chapter is organized as follows: Section 2.1 gives a concise description of BCEL, concentrating on the features that we used for the implementation of the tool. Section 2.2.1 presents an overview of the design of the tool. Section 2.2 presents the whole rewriting process, from the creation of the CFG for each method to the creation of "redirection or wrapper" methods; it can be regarded as the main section of this thesis because it summarizes the implementation of the rewriting tool.
2.1 Byte-Code Engineering Library (BCEL)
BCEL is a general-purpose framework for the static analysis and dynamic creation or transformation of Java byte-code. The framework consists of a "static" part and a "generic" part. The static part is not intended for byte-code modification and can be used to analyze Java classes directly from their class files. The generic (or, more precisely, generating) part supplies an abstraction level for creating and transforming class files dynamically. In the following we introduce some notions that are used in BCEL. A complete description of the API can be found in [10].
2.1.1 Static part
The static part represents Java class files: all binary components and data structures declared in the JVM specification are mapped to classes, i.e. every element of a class file, such as the access flags, the constant pool, the methods and fields, the code, the exception table, etc., has an associated class in the static part of BCEL. The class JavaClass, which is the top-level data structure, gives access to all the information about a class file (basically fields, methods, and symbolic references to the super class and to the implemented interfaces of the represented class). A parser object parses the binary class file and creates the associated JavaClass object. BCEL also provides a Repository class that reads class files and creates JavaClass objects. In a Java class file, the constant pool serves as a kind of central repository and is therefore a very important component. It contains, for example, the entries describing the type signatures of methods and fields. It also contains strings, integers and other constants. Byte-code instructions, other components of the class file, and constant pool entries themselves may contain indexes into the constant pool. A ConstantPool object thus contains an array of Constant entries that can be retrieved using an integer index. Using the static part of BCEL it is possible to perform class analysis (such as the accounting block analysis or the structural analysis for native methods described in Section 1.3.2).
2.1.2 Generic part
The generic part of BCEL allows byte-code components to be modified dynamically. It supplies an abstraction level for creating and transforming class files dynamically. All static information of a class file is rendered generic in this part of BCEL; for example, the ConstantPoolGen class offers methods for adding, updating and removing different types of constants. The class ClassGen provides an interface to add methods, fields and attributes, and also to create, remove and replace a method. The MethodGen class makes it possible to create and modify a method.

Generic fields and methods: FieldGen objects represent fields. Fields may optionally have an initialization value if they have the access flags static final, i.e. if they are constants. A generic method contains methods to add local variables, exceptions that the method may throw, and exception handlers, and can produce the final method with a final constant pool.
Instruction objects: By modeling instructions as objects, BCEL gives the programmer a sufficient level of abstraction over control flow, without having to handle details such as concrete byte-code addresses. An instruction object basically consists of a tag, i.e. an opcode (the actual operation code of the instruction), and its length in bytes. Instructions are grouped via sub-classing; for example, goto and ifneq are both branch instructions and have the corresponding classes GOTO and IFNEQ. The same holds for other groups, such as local variable instructions (e.g. iload, aload), return instructions, stack instructions, etc.

Instruction list: An instruction list is implemented as a list of instruction handles that encapsulate instruction objects. References to instructions in the list are thus not implemented as direct pointers to instructions but as pointers to instruction handles. This makes appending, inserting and deleting areas of byte-code easier. Since only symbolic references are used, concrete byte-code offsets need not be computed until the instrumentation process is finished. The instruction handles are organized in a doubly linked chain.

Appending instructions or other instruction lists anywhere in an existing list is possible. The instructions are appended after the given instruction handle. All append methods return a new instruction handle that may be used as the target of a branch instruction.

Inserting instructions is also possible anywhere in an existing list. The instructions are inserted before the given instruction handle. As before, the insertion methods return a new instruction handle, which may be used as a target.

Deleting instructions is done by removing all instruction handles, and the instructions they contain, within a given range.
When deleting an instruction, there may still be instruction targeters referencing one of the deleted instructions (any instruction can be the target of an exception handler or of a branch instruction). To handle such cases, an exception (TargetLostException) is thrown when the delete method is invoked on the instruction list and there are still references to the instructions to be deleted. This exception can then be handled by redirecting all the targeters to a new instruction handle.

Instruction targeters: Because exception handlers and local variables contain references to byte-code addresses, they also take the role of instruction targeters. This means that they provide a method updateTarget to redirect a reference. Generic non-abstract methods (i.e. methods that have an implementation) refer to instruction lists that consist of instruction objects (in BCEL, every byte-code instruction is mapped to an object). References to byte-code addresses are implemented as handles to instruction objects. Local variables are represented by the LocalVariableGen class, which holds two references to instruction handles in the instruction list. These handles (start and end) define the scope of the local variable. Exception handlers, represented by CodeExceptionGen, reference the start and end instruction handles of a try block as well as the instruction handle of the handling code. Finally, any branch instruction (BranchInstruction class) refers to an instruction handle representing the target of the branch. Thus, there are three kinds of instruction targeters (implementing the InstructionTargeter interface): local variables, exception handlers and branch instructions.
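The benefit of symbolic references can be illustrated with a small self-contained model. The classes Handle and HandleList below are hypothetical and much simpler than BCEL's own InstructionHandle and InstructionList; they only show why a branch that points to a handle, rather than to a byte offset, stays valid when instructions are inserted, and why concrete offsets can be computed at the very end.

```java
/** Hypothetical handle wrapping one instruction; not BCEL's InstructionHandle. */
final class Handle {
    final String mnemonic;
    final int length;   // length of the instruction in bytes
    Handle prev, next;  // doubly linked chain
    Handle target;      // symbolic branch target, or null for non-branch instructions

    Handle(String mnemonic, int length) {
        this.mnemonic = mnemonic;
        this.length = length;
    }
}

/** Hypothetical instruction list made of handles. */
final class HandleList {
    private Handle head, tail;

    /** Appends an instruction at the end and returns its handle. */
    Handle append(String mnemonic, int length) {
        Handle h = new Handle(mnemonic, length);
        if (tail == null) {
            head = tail = h;
        } else {
            tail.next = h;
            h.prev = tail;
            tail = h;
        }
        return h;
    }

    /** Inserts an instruction before pos and returns the new handle. */
    Handle insertBefore(Handle pos, String mnemonic, int length) {
        Handle h = new Handle(mnemonic, length);
        h.next = pos;
        h.prev = pos.prev;
        if (pos.prev == null) {
            head = h;
        } else {
            pos.prev.next = h;
        }
        pos.prev = h;
        return h;
    }

    /** Computes the concrete byte offset of a handle; only needed once editing is finished. */
    int offsetOf(Handle h) {
        int offset = 0;
        Handle cur = head;
        while (cur != null && cur != h) {
            offset += cur.length;
            cur = cur.next;
        }
        if (cur == null) {
            throw new IllegalArgumentException("handle is not in this list");
        }
        return offset;
    }
}
```

In this model, inserting a three-byte instruction in front of a branch target moves the target's offset, but the branch still refers to the same handle, so nothing has to be patched until the final offsets are generated.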
2.1.3 Section Summary
The above condensed description of the static and generic parts of BCEL shows the basic functionality that can be used to implement the analysis and instrumentation needed to manipulate resource information. We wanted to give an overview of the possibilities that BCEL provides and some ideas about how the constant pool, fields, methods, instruction objects, instruction lists, local variables, exception handlers, etc. can be instrumented. The next section details the actual usage of BCEL in the implementation of the rewriting tool.
2.2 The rewriting tool
This section explains the design of the rewriting tool, based on the policies defined previously, and gives a step-by-step overview of the rewriting process.
2.2.1 Design of the rewriting tool
The tool has been designed to perform accounting block analysis, perform structural analysis, create the associated Control Flow Graph (CFG), introduce optimization strategies, perform all the byte-code instrumentation following the rewriting rules for accounting, and generate byte-code that is ready to be loaded and executed. The tool is able to instrument both normal Java applications and JDK shared classes. Both the static and the generic part of BCEL are used to implement the complete rewriting of byte-code. The rewriting tool is composed of the following classes: CPU, CFGAlgo, Graph, RootNodes and Node, which are the main classes of the rewriting process. Optimizations are handled by the classes SelectOpt, Optimization, O1, ..., On. In the case of JDK rewriting, the Analyzer class is used to create the list of methods that will not have a "wrapper". The CPU class is the main rewriting application; it takes as input the list of classes that must be rewritten, as well as a list of classes and methods that require special manipulation. The Node class performs the actual insertion of the accounting code, because each node has a different strategy for inserting it. The SelectOpt class selects the optimizations to be applied according to a configuration defined before the execution of the tool. Each optimization defines which nodes receive "accounting code" and which do not.
Figure 2.1: UML diagram of the rewriting tool for CPU accounting
2.2.2 Summary of the rewriting process
Once CPU receives the list of classes to instrument, each class is retrieved from the BCEL Repository, which returns the associated JavaClass object containing all the information about the class file. To allow the actual instrumentation of the class, a ClassGen object and the corresponding ConstantPoolGen object are obtained, as shown in Figure 2.2. Then, for each method in the class, a MethodGen object is created. This object is used for all transformations that are necessary in the method, such as modification of the signature, instrumentation of the instructions, modification of targeter scopes, etc. The first modification applied to the mgen object, which was created from the original method defined in the class, is the modification of its signature. This adds the new accounting object as an additional argument, and also as a local variable of the newly instrumented method. For the actual instrumentation of the method body (i.e. if the method is not abstract, native or declared in an interface), the InstructionList associated with the method is retrieved and used for the creation of the accounting blocks and the associated Control Flow Graph. The Graph class allows an accounting strategy to be set for the creation/modification of the CFG. The CFG is created with one Node object for each accounting block. A Node object typically contains a reference to the instruction handle of the first instruction of an accounting block, and the length of the accounting block. Using this information it is possible, for example, to find the instruction handle of the last instruction of the block by moving through the corresponding instruction list. For each accounting block, if the invocations in the method body need to be rewritten (depending on the rules defined for the method), i.e. if it is necessary to add the accounting object as an additional argument, the adjustInvocations method is called.
This method adds the code needed to push the accounting object reference before the method invocation, inserts the new signature of the invoked method in the constant pool, and performs all the necessary updates, such as updating the targeters and recalculating the accounting block size. The actual accounting instructions are inserted for all the accounting blocks in the graph. The addAccount method takes care of adding the byte-code instructions for accounting. Once again, inserting the new instruction handles for accounting requires updating the block size and handling particular cases such as the accounting of constructors. Once all the modifications have been performed in the body of the method, the new instrumented method body replaces the old one (using the setInstructionList method). Then the original method can be removed from the class, and the new instrumented method can be added to the ClassGen together with the reference to the new method with the modified signature. Wrappers and redirections are created after the instrumentation of the method body. This requires, for example, updating the information in the constant pool about the new method (with the accounting object), as well as special manipulations in the case of native methods. Once the instrumentation of the class file is complete, the modified class can be generated using the modified ClassGen and the modified constant pool containing all the inserted accounting information. Finally, the class is ready to be loaded and executed (in the code shown in Figure 2.2, the class file is dumped to disk under the same name as the original class). Let us describe the five principal parts of rewriting: (a) adding accounting information to the class, (b) adjusting invocations in method bodies, (c) adding accounting instructions, (d) replacing the original method with the instrumented one, and (e) creating redirections with method wrappers.
JavaClass jcl = Repository.lookupClass(classname);
// the generic class based on the byte-code
ClassGen cl = new ClassGen(jcl);
ConstantPoolGen cp = cl.getConstantPool();

// for each method defined in the class
Method[] ms = cl.getMethods();
for (int i = 0; i < ms.length; i++) {
    // create a method that can be manipulated
    MethodGen mgen = new MethodGen(ms[i], classname, cp);
    modifySignature(mgen);

    // obtain the instruction list to instrument the method body
    InstructionList il = mgen.getInstructionList();
    if (il != null) { // if the method has a body
        // create the Control Flow Graph and add accounting byte-code
        Graph gr = new Graph(new RootNodes());
        gr.createCFG(il);
        // perform instrumentation following the rewriting rules
        gr.adjustInvocations(cp, il, cl, mgen);
        gr.addAccount(cp, il, cl, mgen);
        // update method body with the instrumented one
        mgen.setInstructionList(il);
    }

    // remove the original method and put the new instrumented one
    cl.replace(ms[i], mgen.getMethod());
    cp.addMethodref(mgen);

    // add wrapper if needed
    if (requireWrapper())
        createWrapper(cl, cp, mgen);
}

// instrumentation finished, the modified byte-code can be generated
jcl = cl.getJavaClass();
jcl.setConstantPool(cp.getFinalConstantPool());
jcl.dump(classname);
Figure 2.2: Rewriting at byte-code level
2.3 The accounting process for a method
This section describes in detail how the accounting process for each method of a class is implemented by the rewriting tool. The process consists of two phases: (1) analyzing the byte-code to construct a CFG, and (2) the rewriting process itself, which is explained in full detail.
2.3.1 Creation of the Control Flow Graph (CFG)
For each method, it is necessary to create a CFG. The class CFGAlgo is in charge of building it from the instruction list of the method:

InstructionList il = mgen.getInstructionList();
if (il != null) {
    CFGAlgo algo = new CFGAlgo();
    Graph gr = algo.calculGraph(il);
}

The class that represents the CFG is Graph; it is composed of a table of root nodes. Each root node represents a sub-graph. The first sub-graph represents the normal execution flow. The other sub-graphs are not part of it, e.g. the exception handlers. To construct the CFG, the instruction list is examined sequentially in order to find all instructions that change the flow of execution: branch instructions (GOTO, JSR, IF, TABLESWITCH, LOOKUPSWITCH, etc.), returns (RETURN, RET) and explicit throwing of exceptions (ATHROW). These instructions determine the end of a graph node, and also the beginning of other nodes. In the case of branch instructions, execution can continue in several other nodes. Instructions of type GOTO and JSR have only one successor node; an IF instruction always has two successor nodes; instructions of type SELECT (TABLESWITCH, LOOKUPSWITCH) have at least one successor (the default), but they can have more than two. ATHROW has no successor in the graph, because the successor node can be in the same method, in the caller method or higher in the call sequence. RETURN instructions do not have successors. The RET instruction can have several successors, because several JSR instructions can jump to the sub-routine terminated by the RET instruction. Thus, to find the successors, we use a hash table in which the key is the index of the local variable that the target instruction of the JSR uses to keep the return position, and the value is a vector with the instructions that follow the JSR instructions in the instruction list (i.e. JSR+3). Since the JSR instruction is used in try-catch-finally blocks, the sub-routine always follows the execution flow. The sub-routine can then be divided into nodes. Thus, sub-routines cannot begin a sub-graph.
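The splitting of an instruction list into nodes can be sketched as follows. This is a simplified model, not the CFGAlgo class: Insn and BlockSplitter are hypothetical names, instructions are identified by their position in the list instead of by instruction handles, and the JSR/RET and switch cases are left out for brevity.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

/** Hypothetical, minimal instruction model: a kind and an optional branch target index. */
final class Insn {
    enum Kind { PLAIN, GOTO, IF, RETURN, ATHROW }

    final Kind kind;
    final int target; // index of the branch target in the list, or -1 if none

    Insn(Kind kind, int target) {
        this.kind = kind;
        this.target = target;
    }
}

final class BlockSplitter {
    /** Returns the sorted indexes at which a new node (accounting block) starts. */
    static List<Integer> blockStarts(List<Insn> code) {
        TreeSet<Integer> starts = new TreeSet<>();
        if (!code.isEmpty()) {
            starts.add(0);                      // the first instruction starts a node
        }
        for (int i = 0; i < code.size(); i++) {
            Insn in = code.get(i);
            if (in.kind == Insn.Kind.PLAIN) {
                continue;                       // does not change the flow of execution
            }
            if (in.target >= 0) {
                starts.add(in.target);          // a branch target starts a node
            }
            if (i + 1 < code.size()) {
                starts.add(i + 1);              // the instruction after a flow change starts a node
            }
        }
        return new ArrayList<>(starts);
    }

    /** Returns the successor indexes of the node whose last instruction is at index last. */
    static List<Integer> successors(List<Insn> code, int last) {
        Insn in = code.get(last);
        List<Integer> succ = new ArrayList<>();
        switch (in.kind) {
            case GOTO:
                succ.add(in.target);            // exactly one successor
                break;
            case IF:
                if (last + 1 < code.size()) {
                    succ.add(last + 1);         // fall-through successor
                }
                succ.add(in.target);            // branch successor
                break;
            case RETURN:
            case ATHROW:
                break;                          // no successor inside the graph
            default:
                if (last + 1 < code.size()) {
                    succ.add(last + 1);         // plain fall-through into the next node
                }
        }
        return succ;
    }
}
```

For a six-instruction method with an IF at index 1 (target 4), a GOTO at index 3 (target 5) and a RETURN at index 5, blockStarts yields the node starts 0, 2, 4 and 5.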
2.3.2 Adding accounting information
The original application is not aware of accounting, and no information related to it is available in the classfile. Information about the newly introduced accounting object must be added to the class, and the object must be added as a local variable in methods receiving the accounting argument.

Insertion of accounting information in classfile

Since we are adding new information about the accounting object to the rewritten classes, it is necessary to insert new symbolic information into the constant pool of the class. This is necessary in order to ensure that the generated class contains the correct references for loading and execution. As shown in Section 1.2.2, the CPUAccount object contains a volatile integer value to store the consumption (the usage field). Thus, it is necessary to add to the constant pool a new entry for the class CPUAccount and another for the usage field. For this, the ConstantPoolGen class provides the addClass and addFieldref methods. Notice that the latter requires three arguments: the name of the class to which the field belongs, the name of the field, and its signature ("I" for integer). Similarly, it is necessary to add information about the methods that will be invoked in the CPUAccount class, such as getOrCreate. The addMethodref method inserts the necessary information about the method. The signature of the method, "()Ltools/rc/cpu/CPUAccount;", is passed as argument, meaning that the method takes no argument and returns a CPUAccount object.

    protected static String classAcc = "tools.rc.cpu.CPUAccount";
    protected static String methodAcc = "getOrCreate";
    protected static String consAccSig = "()Ltools/rc/cpu/CPUAccount;";
    . . .
    // adding CPU related information
    cp.addClass(classAcc);
    cp.addFieldref(classAcc, "usage", "I");
    . . .
    cp.addMethodref(classAcc, methodAcc, consAccSig);
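To make the signature string less opaque, the following sketch shows how such a JVM method descriptor is assembled from dotted class names. The helper class DescriptorSketch and its methods are hypothetical and handle object types only; they are not part of BCEL or of the rewriting tool.

```java
/** Illustrative helper: builds JVM descriptors for object types only. */
class DescriptorSketch {
    /** Converts a dotted class name into a field descriptor of the form Lpkg/Name; */
    static String objectDescriptor(String className) {
        return "L" + className.replace('.', '/') + ";";
    }

    /** Builds a method descriptor: argument descriptors in parentheses, then the return type. */
    static String methodDescriptor(String returnClass, String... argClasses) {
        StringBuilder sb = new StringBuilder("(");
        for (String arg : argClasses) {
            sb.append(objectDescriptor(arg));
        }
        return sb.append(')').append(objectDescriptor(returnClass)).toString();
    }

    public static void main(String[] args) {
        // A method without arguments returning a CPUAccount
        System.out.println(methodDescriptor("tools.rc.cpu.CPUAccount"));
    }
}
```

Called with no argument classes and tools.rc.cpu.CPUAccount as return class, the helper yields the same string as consAccSig above.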
Adding the accounting object as new argument

The accounting object must be added as a new local variable of a rewritten method. When the method is rewritten with the additional accounting argument, it suffices to add the type of the accounting object as the last argument type of the new method. The MethodGen class uses this information to create the local variables corresponding to each argument of the method.

    Type objtype = new ObjectType(classAcc);
    Type[] margs = mgen.getArgTypes();
    Type[] newtypes = new Type[margs.length+1];
    // keep the original argument types in place
    System.arraycopy(margs, 0, newtypes, 0, margs.length);
    mgen.setMaxLocals(mgen.getMaxLocals() + 1);
    // the accounting object becomes the last argument
    newtypes[margs.length] = objtype;
    mgen.setArgTypes(newtypes);
If the method is static, the arguments are added as local variables starting at position 0; otherwise an implicit reference to this is always placed in the first local variable. Local variables are referenced by their index number. For example, to push a reference to this onto the stack (in a non-static method), the aload_0 instruction is used. Notice, however, that there is no one-to-one mapping between the number of arguments and the indices of local variables: the basic types double and long use two slots each. It is therefore necessary to compute the exact local variable slot in which the accounting argument is placed, in order to pass it correctly to the rewritten invocations in the method body.

Once the slot occupied by the new argument in the local variable table has been calculated, the references to local variables that are not arguments must be changed at the byte-code level, i.e. a local variable that had slot index 5 must now have index 6. This is because the change to the table of argument types is made at the level of the BCEL classes and not at the byte-code level. The change at the byte-code level must be made for all indexed instructions that reference the local variable table.
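The slot computation described above can be sketched as follows. The class SlotSketch and its method names are hypothetical; argument types are given as JVM type descriptors ("J" for long, "D" for double), and the sketch assumes exactly one new one-slot argument is appended.

```java
/** Illustrative slot arithmetic for an argument appended to a method signature. */
class SlotSketch {
    /** Slots used by a type descriptor: 2 for long (J) and double (D), 1 otherwise. */
    static int slotsOf(String descriptor) {
        return (descriptor.equals("J") || descriptor.equals("D")) ? 2 : 1;
    }

    /** Slot of an argument appended after the given argument descriptors. */
    static int slotOfAppendedArgument(boolean isStatic, String[] argDescriptors) {
        int slot = isStatic ? 0 : 1; // slot 0 holds `this` in instance methods
        for (String d : argDescriptors) {
            slot += slotsOf(d);
        }
        return slot;
    }

    /** New index of a local variable once one extra slot has been inserted at newArgSlot. */
    static int shiftedIndex(int oldIndex, int newArgSlot) {
        return oldIndex >= newArgSlot ? oldIndex + 1 : oldIndex;
    }
}
```

For an instance method taking (int, long, String), the appended argument lands in slot 5, and a local variable that previously used index 5 moves to index 6, as in the example in the text.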
2.3.3 Adjusting invocations in the method body.
To modify the method invocations in the method body, all the byte-code instructions in an accounting block must be analyzed. All InstructionHandle objects, starting from the first handle in the node (the first field) up to the last instruction of the block (found blockSize instructions later), are inspected. If the instruction is not an invocation, i.e. if the Instruction object associated with the current InstructionHandle is not an instance of InvokeInstruction, there is nothing to do and we can process the next instruction. If it is an invocation, we have to check whether the invocation must be rewritten (see the code below); this depends, for example, on whether the invoked method is native, whether the target class belongs to a particular package, or whether the method is a static initializer. This information is provided, for example, by the Analyzer, or can be configured.

    int nbInstrAdd = 0;            // number of instructions to add
    InstructionHandle ih = first;  // first instruction of the accounting block
    for (int i=0; i [...]

[...] 0) successors $B_i$ ($1 \le i \le n$) in the control-flow graph. We denote the accounting size attributes of $A$ and $B_i$ as $a$ and $b_i$, the minimum accounting size $\min b_i$ as $b_{\min}$, and the maximum accounting size $\max b_i$ as $b_{\max}$.
[Figure: CFG before and after the optimization. Before: predecessors $A_1, \dots, A_n$ with sizes $a_1, \dots, a_n$ and their common successor $B$ with size $b$. After: each $A_i$ has size $a_i + b$ and $B$ has size 0.]
Figure 3.1: Optimization 1
3.1.2 O2
If all $B_i$ are different from $A$, and for each $B_i$ the only predecessor is $A$, then $a$ is incremented by $b_{\min}$ and all $b_i$ are decremented by $b_{\min}$. Consequently, the value of at least one $b_i$ becomes zero.
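A minimal sketch of rule O2 on plain integers, assuming the structural precondition (A is the only predecessor of every successor and is not its own successor) has already been checked. The class and method names are hypothetical.

```java
import java.util.Arrays;

/** Illustrative version of rule O2 on accounting sizes only. */
class O2Sketch {
    /**
     * Moves the minimum successor size into the predecessor.
     * b holds the successor sizes and is updated in place;
     * the return value is the new size of A.
     */
    static int applyO2(int a, int[] b) {
        int bMin = Arrays.stream(b).min().orElse(0); // no successors: nothing to move
        for (int i = 0; i < b.length; i++) {
            b[i] -= bMin;
        }
        return a + bMin;
    }
}
```

With a = 3 and successor sizes (5, 2, 7), the predecessor size becomes 5 and the successor sizes become (3, 0, 5).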
3.1.3 O3
If all $B_i$ are different from $A$, and for each $B_i$ the only predecessor is $A$, and the difference $b_{\max} - b_{\min}$ does not exceed a given threshold $T$, then $a$ is incremented by $b_{\max}$ and all $b_i$ are set to zero. Less formally: if the accounting sizes of the successor blocks do not differ too much, the common predecessor block accounts for the longest successor block. This optimization is an aggressive version of rule O2, and the threshold controls its aggressiveness. A threshold $T$ means that a thread executing a block $B_i$ may be charged for up to $T$ byte-code instructions that it did not execute. In general, $T$ should not be smaller than the number of byte-code instructions necessary to update the CPUAccount object (a thread would be charged for the update instructions if the optimization were not applied). To find effective values for the threshold, we can perform static analysis of typical Java programs (looking for the smallest value of $T$ that avoids a significant fraction of the accounting code).
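Rule O3 can be sketched in the same style. As before, the structural precondition is assumed to hold, and the class and method names are hypothetical.

```java
import java.util.Arrays;

/** Illustrative version of rule O3 on accounting sizes only. */
class O3Sketch {
    /**
     * If the successor sizes differ by at most the threshold, charges the
     * largest one to the predecessor and zeroes all successors in place.
     * Returns the (possibly unchanged) size of A.
     */
    static int applyO3(int a, int[] b, int threshold) {
        if (b.length == 0) return a;
        int bMin = Arrays.stream(b).min().getAsInt();
        int bMax = Arrays.stream(b).max().getAsInt();
        if (bMax - bMin > threshold) return a; // too different: rule does not apply
        Arrays.fill(b, 0);
        return a + bMax;
    }
}
```

With a = 3, successor sizes (5, 2, 7) and T = 7, the predecessor size becomes 10 and all successors become zero; with T = 4 nothing changes, because the sizes differ by 5.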
3.1.4 O4
[Figure: CFG before and after the optimization. Before: $A$ with size $a$ and successors $B_1, \dots, B_n$ with sizes $b_1, \dots, b_n$, where $b_k = b_{\min}$. After: $A$ has size $a + b_{\min}$, each $B_i$ has size $b_i - b_{\min}$, and $B_k$ has size 0.]
Figure 3.2: Optimization 2

[Figure: CFG before and after the optimization. Before: $A$ with size $a$ and successors $B_1, \dots, B_n$ with sizes $b_1, \dots, b_n$. After: $A$ has size $a + b_{\max}$ and every $B_i$ has size 0.]
Figure 3.3: Optimization 3

In general, a block with a positive accounting size requires accounting instructions to load, update, and store the usage field of the CPUAccount object. This is done by the basic accounting instructions (see Table 3.1), as described in Section 1.2.2. This implies that for each accounting block, two accesses to the field of the accounting object are performed (the getfield and putfield instructions). In order to reduce the number of accesses to the object field, we introduce a local variable localUsage that caches the value of the usage field. This avoids reloading the usage field in every accounting block: the value is kept in the local variable, which is faster to access.

    // source code
    cpu.usage += blocksize;
    // corresponding byte-code
    aload     CPUAccount object reference
    dup
    getfield  CPUAccount::int usage
    bipush    blocksize
    iadd
    putfield  CPUAccount::int usage

Table 3.1: Basic accounting code (6 instructions)

The following algorithm marks exactly those accounting blocks in the CFG that have to reload the usage field of the CPUAccount object. All other blocks may directly update the localUsage variable and propagate the new value to the usage field of the CPUAccount object.

The marking algorithm

• Initially, we mark the first block in the method, in each JVM subroutine, and in each exception handler.
• If a block contains a method/constructor invocation, all of its successors in the control-flow graph are marked. This is necessary because the accounting value is changed by the method invocation, so the local variable is no longer up to date.
• If a block with an accounting size attribute of zero is marked, all of its successors have to be marked as well. This is necessary since accounting blocks with zero size do not receive any accounting code, and their successors must therefore reload the accounting value.

The algorithm terminates when no further blocks can be marked.

Marked nodes should initialize the localUsage value with the current accounting value of the CPUAccount object (getfield and istore), and then store the updated value in the accounting object (putfield) (see Table 3.2). Compared with the basic accounting, two more instructions are needed to store the accounting value in the local variable. Non-marked nodes, on the other hand, only update the local variable and the usage value in the CPUAccount object (putfield) (see Table 3.3).

    // source code
    localUsage = cpu.usage + blocksize;
    cpu.usage = localUsage;
    // corresponding byte-code
    aload     CPUAccount object reference
    getfield  CPUAccount::int usage
    bipush    blocksize
    iadd
    istore    localUsage
    aload     CPUAccount object reference
    iload     localUsage
    putfield  CPUAccount::int usage

Table 3.2: Updating local variable and usage accounting value (8 instructions)

The local variable is in fact only worthwhile when there are sequences (and cycles) of non-marked nodes (see the non-marked nodes in Figure 3.4). Since method invocations are very frequent, most of the nodes will be marked. As a consequence, the majority of the marked nodes will store the value in a local variable that is never used. To avoid these useless store operations in the local variable, we can refine the marking algorithm.
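The three marking rules can be read as a fixpoint computation over the CFG. The sketch below is one possible formulation, not the tool's implementation: blocks are numbered 0 to n-1, and the parameter arrays (succ, entry, hasInvoke, size) are hypothetical encodings of the information the rules need.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Illustrative fixpoint version of the marking algorithm (set M1). */
class MarkingSketch {
    /**
     * succ[i]: successors of block i.
     * entry[i]: block i is the first block of the method, of a JVM
     *           subroutine, or of an exception handler.
     * hasInvoke[i]: block i contains a method/constructor invocation.
     * size[i]: accounting size of block i.
     */
    static boolean[] markM1(int[][] succ, boolean[] entry, boolean[] hasInvoke, int[] size) {
        int n = succ.length;
        boolean[] marked = new boolean[n];
        Deque<Integer> work = new ArrayDeque<>();
        for (int i = 0; i < n; i++) {
            if (entry[i] && !marked[i]) {          // rule 1: entry blocks
                marked[i] = true;
                work.add(i);
            }
            if (hasInvoke[i]) {                    // rule 2: successors of invoking blocks
                for (int s : succ[i]) {
                    if (!marked[s]) {
                        marked[s] = true;
                        work.add(s);
                    }
                }
            }
        }
        while (!work.isEmpty()) {                  // rule 3: marked zero-size blocks
            int b = work.poll();
            if (size[b] == 0) {
                for (int s : succ[b]) {
                    if (!marked[s]) {
                        marked[s] = true;
                        work.add(s);
                    }
                }
            }
        }
        return marked;
    }
}
```

Each block enters the work list at most once, so the loop stops once no further block can be marked.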
    // source code
    localUsage += blocksize;
    cpu.usage = localUsage;
    // corresponding byte-code
    iinc      localUsage blocksize
    aload     CPUAccount object reference
    iload     localUsage
    putfield  CPUAccount::int usage

Table 3.3: Updating local variable and setting usage value (4 instructions)
First, we use the marking algorithm described above to mark the nodes (M1). In a second pass we mark (M2) all the marked nodes that have no non-marked successors, i.e.:

$M_2 = \{x \in M_1 \mid \forall y \in \mathrm{succ}(x),\ y \in M_1\}$, where $\mathrm{succ}(N) = \{x \in CFG \mid x \text{ succeeds } N\}$ is the set of successors of $N$.

This means that M2-marked nodes do not need to store the local variable, because it is not used anyway. Such a marking defines three isolated regions in the CFG (see Figure 3.4): (a) the local variable region, where only updating the local variable is needed; (b) the boundary region, which needs both reloading the accounting value and storing it in the local variable, because its nodes are the entry points to the local variable region; and (c) the basic accounting region, which loads and stores the value in the accounting object without storing it in the local variable. Using such a marking, we ensure that no unnecessary local variable store operations are performed.
[Figure: a CFG divided into three kinds of regions. Local variable regions (non-marked nodes): localUsage += blocksize; cpu.usage = localUsage. Boundary regions (M1 nodes): localUsage = cpu.usage + blocksize; cpu.usage = localUsage. Basic accounting region (M2 nodes): cpu.usage += blocksize.]
Figure 3.4: Optimization 4
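The second pass can be sketched as a classification of every block into one of the three regions, given the set M1. The Region enum and the method name are hypothetical; the sketch follows the prose definition of M2 (marked nodes that have no non-marked successors).

```java
/** Illustrative second marking pass: derives the three regions from M1. */
class RefinedMarkingSketch {
    /** Hypothetical labels for the regions described in the text. */
    enum Region { LOCAL_VARIABLE, BOUNDARY, BASIC }

    /**
     * succ[i]: successors of block i; m1[i]: block i was marked by the first pass.
     * Non-marked blocks form the local variable region, M2 blocks the basic
     * accounting region, and the remaining M1 blocks the boundary region.
     */
    static Region[] classify(int[][] succ, boolean[] m1) {
        Region[] regions = new Region[succ.length];
        for (int i = 0; i < succ.length; i++) {
            if (!m1[i]) {
                regions[i] = Region.LOCAL_VARIABLE;
                continue;
            }
            boolean allSuccessorsMarked = true;
            for (int s : succ[i]) {
                if (!m1[s]) {
                    allSuccessorsMarked = false;
                    break;
                }
            }
            regions[i] = allSuccessorsMarked ? Region.BASIC : Region.BOUNDARY;
        }
        return regions;
    }
}
```

A marked block without successors is classified as basic accounting, since nothing after it could read the local variable.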
3.1.5 Combinations and heuristics
The optimization rules O1, O2, and O3 aim at combining the accounting for a set of blocks that represent conditional statements, but they do not make it possible to remove the accounting code from loops. For instance, rules O1 and O2 (or alternatively, O1 and O3) may be applied to optimize the accounting for if-else statements. However, these rules are not sufficient to reduce the accounting overhead for if statements without a matching else.
In general, multiple optimization rules can be applied to a given control-flow graph. The order of application is important, since it may affect the quality of the accounting code. Most importantly, the optimization algorithm must ensure termination. In particular, certain loops allow an infinite application of rule O1. The following heuristics help to guide the optimization process:

• An optimization rule may be applied only if the application increases the number of blocks with an accounting size attribute of zero. Since the number of blocks in a method is finite, obeying this rule ensures termination of the optimization algorithm.
• Optimization O1 shall be applied before optimizations O2 and O3.
• Optimization O3 shall be applied before optimization O2. There is no need to apply optimization O2 if optimization O3 (which is more aggressive) succeeds on a certain node in the control-flow graph.
• If there are leaf nodes in the control-flow graph, they should be considered first, then their predecessor nodes, etc.

While optimizations O1, O2, and O3 aim at removing accounting code from certain blocks, rule O4 helps to reduce the overhead of accounting by caching the counter maintained by the CPUAccount in a local variable. This optimization improves performance only for certain JVM implementations (measurements are given in Section 4.2). Optimization O4 must be considered after application of the rules O1, O2, and O3.
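The first heuristic (accept a rule application only if it increases the number of zero-size blocks) can be sketched as a guard around an arbitrary rule. This is an illustration under simplifying assumptions: a rule is modelled as a function on the array of accounting sizes only, and the names GuardSketch, zeroBlocks and applyIfImproving are hypothetical.

```java
import java.util.Arrays;
import java.util.function.UnaryOperator;

/** Illustrative termination guard for optimization rules. */
class GuardSketch {
    /** Number of blocks whose accounting size is zero. */
    static long zeroBlocks(int[] sizes) {
        return Arrays.stream(sizes).filter(s -> s == 0).count();
    }

    /**
     * Applies the rule to a copy of the sizes and keeps the result only if it
     * has strictly more zero-size blocks; otherwise the original array is returned.
     */
    static int[] applyIfImproving(int[] sizes, UnaryOperator<int[]> rule) {
        int[] candidate = rule.apply(sizes.clone());
        return zeroBlocks(candidate) > zeroBlocks(sizes) ? candidate : sizes;
    }
}
```

Because every accepted application strictly increases a counter that is bounded by the number of blocks, only finitely many applications can be accepted, which is the termination argument given in the first bullet.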
3.2 Implementation of optimizations
As described before, each optimization has its own particularities. However, the first three optimizations (O1, O2, O3) share some similarities that can be taken into account in the implementation:

• They are structural optimizations, i.e. they do not depend on the presence of instructions other than those that define the nodes in the CFG. In contrast, O4 depends on the presence of method invocations to change the way the accounting code is inserted. Another similarity is that they change the blockSize field of the node, unlike O4.
• The changes that they make in the CFG are applied to a group of neighboring nodes. This means that a modification in one group of neighbors can be used in the modification of another group that precedes or succeeds it. For example, in O1, the predecessors of a node (which is their only successor) have their blockSize field incremented by the blockSize of their successor node. Furthermore, these predecessors can themselves be unique successors of other predecessors, and their newly changed field is then used to compute the optimal block size.
• To traverse a CFG, we use the Depth First Search (DFS) algorithm. All nodes are visited only once, even if there are backward successors. All optimizations are carried out in two passes; DFS is applied in the first pass of all optimizations, but not in the second pass of O2 and O3. In the first pass, DFS always starts at the root nodes.
3.2.1 O1
To implement this optimization, it is necessary to find the leaf nodes of the CFG. Modifications start from the leaf nodes, because modifying their blockSize does not depend on the result of any other change. The first pass is therefore dedicated to the search for leaf nodes. The depth is incremented by one each time a successor is found. The leaves found are kept in a TreeSet, sorted by their depth in the graph. In the second pass, the resulting TreeSet is used to retrieve the leaf nodes, starting the search (a DFS) from the leaf of greatest depth. This ensures that previous modifications are taken into account in later modifications. The second DFS is similar to the DFS of the first pass, but this time the search starts at the leaf nodes and continues with their predecessor nodes (not their successors, as in the first pass). For each node found, a test determines whether it is the only successor of all its predecessors; if so, the modification is executed (see Section 3.1.1). This modification process can be called "ascending".
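A minimal sketch of the two ingredients described above: the leaf search that orders leaves by depth in a TreeSet, and the O1 test on a single node. The graph is encoded as successor arrays, which is an assumption of this example and not the tool's Node class; excluding self-loops in the test is also an assumption.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.TreeSet;

/** Illustrative leaf search and O1 test on a CFG given as successor arrays. */
class O1Sketch {
    /** First pass: DFS from the root, collecting {node, depth} for every leaf, deepest first. */
    static TreeSet<int[]> leavesByDepth(int[][] succ, int root) {
        TreeSet<int[]> leaves = new TreeSet<>(
            Comparator.<int[]>comparingInt(e -> -e[1]).thenComparingInt(e -> e[0]));
        dfs(succ, root, 0, new boolean[succ.length], leaves);
        return leaves;
    }

    private static void dfs(int[][] succ, int node, int depth, boolean[] seen, TreeSet<int[]> leaves) {
        if (seen[node]) return; // each node is visited once, even with backward successors
        seen[node] = true;
        if (succ[node].length == 0) leaves.add(new int[] {node, depth});
        for (int s : succ[node]) dfs(succ, s, depth + 1, seen, leaves);
    }

    /**
     * O1 on node b: if b is the only successor of all its predecessors,
     * its size is added to each predecessor and then set to zero.
     * Returns true if the modification was applied.
     */
    static boolean applyO1(int[][] succ, int[] size, int b) {
        List<Integer> preds = new ArrayList<>();
        for (int p = 0; p < succ.length; p++) {
            for (int s : succ[p]) {
                if (s == b) {
                    // assumption: self-loops and predecessors with other successors block O1
                    if (p == b || succ[p].length != 1) return false;
                    preds.add(p);
                }
            }
        }
        if (preds.isEmpty()) return false;
        for (int p : preds) size[p] += size[b];
        size[b] = 0;
        return true;
    }
}
```

On the diamond 0 → {1, 2}, 1 → {3}, 2 → {3}, the only leaf is node 3; applying O1 to it adds its size to nodes 1 and 2, whereas O1 does not apply to node 1, because its predecessor 0 has two successors.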
3.2.2 O2, O3
The implementations of these two algorithms are very similar; both are performed in two passes. The first pass finds the nodes that are the only predecessor of their successors and stores them in a stack. This ensures that earlier changes are taken into account in later modifications of the CFG. The first pass starts at the root nodes of the CFG. This search can be called "descending". In the second pass, the nodes stored in the stack are retrieved and, for each one, its blockSize field is changed, as well as the blockSize of its successors, according to the specifications of O2 (see Section 3.1.2) and O3 (see Section 3.1.3).
3.2.3 O4
Being behavioral and not structural, this optimization is implemented differently. It follows the two-pass algorithm described in Section 3.1.4. To mark the nodes, a field named "markedO4" is added to the Node class to record which "marking set" the node belongs to. The first pass traverses the CFG, beginning at the root nodes, to find the nodes belonging to M1. Similarly, the second pass starts at the root nodes but uses the information of the first pass to find the marking set M2. The "markedO4" field is updated to record membership in M1 and M2. The Node class is modified to allow the conditional addition of the "accounting block": the "markedO4" field is tested to determine which code is added to the node. The local variable to be used is declared in the first node of the graph. O4 never changes the blockSize of a node, and therefore never sets it to zero.
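The conditional choice of accounting code can be sketched as a selection driven by the marking of the node. The sketch returns instruction mnemonics as strings instead of BCEL instructions; the Marking enum, the abbreviated operand names and the treatment of zero-size blocks are assumptions of this example.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Illustrative selection of the accounting code according to the O4 marking. */
class O4EmitSketch {
    /** Hypothetical values for the markedO4 field. */
    enum Marking { NONE, M1, M2 }

    /** Returns the accounting instructions for a block as mnemonic strings (see Tables 3.1 to 3.3). */
    static List<String> accountingCode(Marking m, int blockSize) {
        if (blockSize == 0) return Collections.emptyList(); // zero-size blocks get no accounting code
        String push = "bipush " + blockSize;                // bipush only covers small constants
        switch (m) {
            case M2: // basic accounting region: no local variable involved
                return Arrays.asList("aload cpu", "dup", "getfield usage", push, "iadd", "putfield usage");
            case M1: // boundary region: reload the field and refresh the local variable
                return Arrays.asList("aload cpu", "getfield usage", push, "iadd", "istore localUsage",
                                     "aload cpu", "iload localUsage", "putfield usage");
            default: // local variable region: update the local variable, then write it back
                return Arrays.asList("iinc localUsage " + blockSize,
                                     "aload cpu", "iload localUsage", "putfield usage");
        }
    }
}
```

The three branches produce 6, 8 and 4 instructions respectively, matching the instruction counts given for Tables 3.1, 3.2 and 3.3.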
3.2.4 Combinations
To allow the combination of optimizations in the rewriting tool, a module was added that applies an optimization to a method's CFG; the resulting CFG (with reduced accounting blocks) is then traversed again by the other optimizations. This process is done sequentially.
3.3 Summary
In this chapter, four optimizations that can be applied to the CFG were described, as well as their implementations. We discussed their similarities and differences concerning the implementation. We can conclude that some optimization algorithms are better suited than others, depending on the application structure (i.e. the "form" of the CFGs of its methods).
Chapter 4
Evaluations
In order to measure the overhead that accounting code introduces, we have applied the rewriting tool to complex Java applications. We have chosen to use standard benchmark applications in order to show that our implementation can be applied to arbitrary applications. In this chapter we present evaluations of the basic rewriting algorithm as well as of the optimizations described in Chapter 3. These results show the importance of reducing the execution overhead, and let us evaluate the impact of CPU accounting in complex applications.
4.1 CPU Rewriting Tool Evaluation
This section presents performance measurements showing that the overhead due to the completely portable CPU resource accounting implementation (see Chapter 2) is acceptable on modern JVM implementations. As described before, the rewriting tool was designed to account for the CPU resources of arbitrary Java applications, but the final goal is to integrate CPU accounting for resource control into the reflective execution environment, as discussed in Section 5.2. Here, the instrumentation of complex Java applications is measured, showing that the approach is not limited to mobile code environments. For the results shown here, the rewriting was performed off-line on both the Java applications and the JDK library classes.

The measurements presented here focus on the overhead introduced by resource accounting at execution time, i.e. the overhead introduced by the execution of the modified code, and not by the instrumentation itself. The goal is to show that the price to pay for obtaining resource consumption information is small enough for the approach to be applied to complex applications. The measurements also show that the rewriting rules can be applied to any kind of Java application and are fully portable.

We have chosen to apply the byte-code rewriting tool to the highly complex Java applications of the standard benchmark suite SPECjvm98 [23]. SPECjvm98 contains several Java applications that are used to measure the performance of different JVM implementations. The benchmark programs are fully implemented in Java and do not use any native methods. The applications come from different domains, such as databases, expert systems, data compression, signal processing and compilation. Measurements were performed on a Linux platform (AMD Athlon, 1200 MHz clock rate, 256 MB RAM, Linux kernel 2.4.2) with IBM's JDK 1.3 implementation, which includes one of the best Just-in-Time compilers currently available.
We measured the overhead due to CPU accounting in three different configurations, as shown in Figure 4.1:

• Ubench-Ujdk: Unmodified benchmarks on an unmodified JDK.
• Rbench-Ujdk: Rewritten benchmarks on an unmodified JDK.
• Rbench-Rjdk: Rewritten benchmarks on a rewritten JDK.
[Figure: three software stacks. Unmodified: JVM98 benchmarks on JDK libs on the JVM. Modified application: CPU-aware JVM98 benchmarks on JDK libs on the JVM. Modified application and shared libraries: CPU-aware JVM98 benchmarks on CPU-aware JDK libs on the JVM.]
Figure 4.1: Different configurations of SPECjvm98 (unmodified and rewritten)

Modern JVMs allow the use of user-defined library classes (JDK), which can be specified when the JVM is started (using the -Xbootclasspath option). This allows users to provide their own implementation of the shared library classes, which are loaded by the system class loader and are shared by all applications. As discussed in Section 1.3.2, the rewriting of the JDK shared libraries needs special treatment for native methods, which is handled correctly by the rewriting tool. Introducing accounting in the JDK classes gives more accurate accounting of the CPU consumption in both the application and the shared libraries. This is important since a CPU-intensive application can itself "consume" a very small amount of CPU resources while invoking methods in library classes that execute an enormous number of byte-code instructions. In order to minimize the impact of compilation and garbage collection, all results represent the median of 70 different measurements, and the total time (in seconds) for each configuration was computed. About 520 Java class-files were rewritten for the CPU-aware version of the SPECjvm98 benchmarks, and about 5400 class-files for the extended version of the JDK.
            Ubench-Ujdk         Rbench-Ujdk          Rbench-Rjdk
mtrt         48.817 (0.00%)      67.111 (37.47%)      68.496 (40.31%)
jess         36.221 (0.00%)      52.735 (45.59%)      59.522 (64.33%)
compress    147.093 (0.00%)     198.995 (35.29%)     199.558 (35.67%)
db           80.289 (0.00%)      89.036 (10.89%)     110.421 (37.53%)
mpegaudio   126.287 (0.00%)     167.659 (32.76%)     167.847 (32.91%)
jack         23.492 (0.00%)      26.541 (12.98%)      35.397 (50.68%)
javac        44.015 (0.00%)      55.956 (27.13%)      78.681 (78.76%)
Total       507.214 (0.00%)     659.033 (29.93%)     720.922 (42.13%)
Table 4.1: Overhead of CPU accounting (time in seconds). JIT and volatile disabled.
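The percentages in Table 4.1 appear to follow the usual relative-overhead formula, (rewritten time / original time − 1) × 100. This is inferred from the table values rather than stated in the text, and the class and method names below are hypothetical.

```java
/** Illustrative computation of the relative overhead reported in the tables. */
class OverheadSketch {
    /** Overhead of the rewritten run relative to the original run, in percent. */
    static double overheadPercent(double originalSeconds, double rewrittenSeconds) {
        return (rewrittenSeconds / originalSeconds - 1.0) * 100.0;
    }

    public static void main(String[] args) {
        // mtrt row of Table 4.1: Ubench-Ujdk versus Rbench-Ujdk
        System.out.println(overheadPercent(48.817, 67.111));
    }
}
```

For the mtrt row (48.817 s versus 67.111 s) this evaluates to roughly 37.47, which agrees with the percentage reported in the table.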
Since the use of a volatile accounting variable (the usage field in CPUAccount objects) is of capital importance to ensure coherence in the accounting strategy¹, we have measured the overhead introduced by declaring the accounting variable volatile. We also measured the impact of the Just-In-Time (JIT) compiler, which drastically reduces the application execution time and also has an impact on the accounting overhead itself. For each measurement, Table 4.1 shows the execution time of the benchmark in seconds (rounded to 3 decimal places), as well as the speedup of the original code compared to the rewritten version (rounded to 2 decimal places). These initial measurements were done with the accounting variable not declared as volatile and without the JIT compiler. Considering the total time of all benchmarks, the overhead due to the CPU accounting is about 30% for the rewritten version running on an unmodified JDK and about 42% with the rewritten JDK (the overhead for pure JDK accounting is about 12%). In practice, such an overhead can be restrictive, for example for applications with real-time requirements, limiting the advantage of portable accounting. In such cases, where it is not possible to use e.g. a Just-In-Time compiler, a trade-off must be established between the time available for rewriting optimizations that reduce the overhead (which require additional processing time to enhance the rewriting of the code and perform sophisticated analysis) and the time for the execution of the application. Here we concentrate on the impact of a JIT; later we describe some optimizations.

¹ Without the volatile declaration, the accounting value may be cached in a register rather than being directly updated in main memory. Thus, an application that uses this value, e.g. a scheduler, may not see the correct accounting value.

                                Rbench-Ujdk                            Rbench-Rjdk
            Ubench-Ujdk         no volatile        volatile            no volatile        volatile
mtrt         3.783 (0.00%)       4.113 (8.73%)      4.517 (19.40%)      4.341 (14.75%)     4.788 (26.57%)
jess         5.299 (0.00%)       5.654 (6.69%)      5.842 (10.25%)      6.268 (18.29%)     6.415 (21.06%)
compress    11.605 (0.00%)      12.630 (8.83%)     13.344 (14.98%)     12.749 (9.86%)     13.449 (15.89%)
db          19.775 (0.00%)      20.255 (2.43%)     20.443 (3.38%)      23.110 (16.86%)    23.381 (18.24%)
mpegaudio    4.484 (0.00%)       5.752 (28.27%)     6.345 (41.50%)      5.706 (27.25%)     6.316 (40.86%)
jack         4.120 (0.00%)       4.350 (5.58%)      4.294 (4.22%)       4.817 (16.92%)     5.033 (22.16%)
javac        9.400 (0.00%)      10.625 (13.03%)    10.643 (13.22%)     12.181 (29.59%)    12.401 (31.93%)
Total       58.466 (0.00%)      63.379 (8.40%)     66.429 (13.62%)     69.172 (18.31%)    72.782 (24.49%)
Table 4.2: Overhead of CPU accounting (time in seconds). JIT enabled

As an important enhancement, Table 4.2 shows the overhead due to CPU accounting with the JIT compiler enabled. The overhead of the volatile accounting variable is also measured. We can see that the benchmarks run roughly 10 times faster overall than without the JIT, as reported in Table 4.1 (e.g. the total execution time for Ubench-Ujdk is about 58 sec. with the JIT and more than 500 sec. without it). We can also see that the use of the JIT reduces the accounting overhead by about 2/3 for the rewritten benchmarks with the unmodified JDK (from 30% to only 8%), and that the overhead is more than halved for the rewritten version with the rewritten JDK (from 42% to 18%). Furthermore, the use of the volatile accounting variable introduces an overhead of about 4% for the version without a rewritten JDK, and an overhead of about 6% for the version with a modified JDK.

Figure 4.2 shows the distribution of the overhead introduced by the basic accounting of each benchmark, by the accounting of the rewritten JDK classes, and by the use of the volatile accounting variable. We can see that, in general, the overhead introduced by JDK rewriting is relatively small compared with the application accounting, except for db and javac. We can also see that the overhead introduced by the use of volatile is small compared with the accounting itself.

We have defined some simple optimization rules that reduce the overhead of the accounting. These optimizations serve, for example, to compensate for the small overhead introduced by the volatile declaration. The optimizations, as well as the corresponding measurements, are described in detail in the following section.
[Figure: stacked bar chart of the execution time in seconds (0 to 25) for each benchmark (mtrt, jess, compress, db, mpegaudio, jack, javac), split into basic benchmark, application account, JDK account and volatile.]
Figure 4.2: Summary of the overhead for accounting
4.2 Evaluation with optimizations
We have measured how the overhead is reduced by the optimizations described above. The measurements were performed on the standard SPECjvm98 benchmarks with the same hardware and software configuration that was used in the original evaluation without optimizations (see Section 4.1). Eleven versions of SPECjvm98 were used for the measurements: one version without any rewriting (used as a reference measurement), a rewritten version without any optimization, against which the overhead of the optimized versions is compared, versions with a single optimization (O1, O2, O3 and O4), and versions with some combinations (e.g. O1+O2, O2+O4, etc.). The different optimizations and their combinations were applied only to the SPECjvm98 benchmark programs and not to the shared libraries (JDK libraries), i.e. the measurements were done with rewritten benchmarks on an unmodified JDK (noted Rbench-Ujdk in Section 4.1).
4.2.1 Disabled Just-in-Time (JIT) compiler
Table 4.3 shows the different measurements without the JIT. For each optimized version of SPECjvm98, the execution time (in seconds) as well as the overhead (in percent) is shown. The combinations of optimizations were applied sequentially, i.e. the second optimization is applied to the graph resulting from the application of the first one. Notice that the combinations are not additive, i.e. the overhead reduction after the application of O1+O3 is not the sum of the reductions of O1 and O3 taken independently.

Since O4 is a behavioral optimization (we use local variables to cache the accounting value), as opposed to the other optimizations, which are structural, its overhead reduction depends much more on the nature of the application as well as on the use of a Just-in-Time compiler. Thus, we can see that applying O4 without the JIT may introduce additional overhead rather than reducing it (see in Table 4.3 that O4 on its own introduces a higher overhead than the unoptimized rewritten version for every benchmark). In the following we show that this phenomenon disappears when the JIT compiler is enabled, and that O4 then reduces the overhead as expected. We have determined which optimization best reduces the overhead for each benchmark (see the "Best opt." row in Table 4.5). This tells us which optimizations can be applied to obtain the lowest overhead.

           mtrt             jess             compress         db               mpegaudio        jack             javac
Normal      48.817  0.00%    36.221  0.00%   147.093  0.00%    80.289  0.00%   126.287  0.00%    23.492  0.00%    44.015  0.00%
Rewritten   67.111 37.47%    52.735 45.59%   198.995 35.29%    89.036 10.89%   167.659 32.76%    26.541 12.98%    55.956 27.13%
O1          66.700 36.63%    50.489 39.39%   197.039 33.96%    86.480  7.71%   156.366 23.82%    26.093 11.07%    55.030 25.03%
O2          66.340 35.90%    50.355 39.02%   196.335 33.48%    88.967 10.81%   163.242 29.26%    26.221 11.62%    54.944 24.83%
O3          66.895 37.03%    50.144 38.44%   193.029 31.23%    88.166  9.81%   161.116 27.58%    26.162 11.37%    54.494 23.81%
O4          68.585 40.49%    57.354 58.34%   215.716 46.65%    90.808 13.10%   180.398 42.85%    27.256 16.02%    58.578 33.09%
O1+O2       66.769 36.77%    50.428 39.22%   195.583 32.97%    87.251  8.67%   162.019 28.29%    26.171 11.40%    54.617 24.09%
O1+O3       66.369 35.95%    48.509 33.93%   192.401 30.80%    86.314  7.50%   154.490 22.33%    25.999 10.67%    54.608 24.07%
O1+O4       67.410 38.09%    56.053 54.75%   212.063 44.17%    87.785  9.34%   168.788 33.65%    26.997 14.92%    57.695 31.08%
O2+O4       67.330 37.92%    54.969 51.76%   211.759 43.96%    90.246 12.40%   176.824 40.02%    27.009 14.97%    57.220 30.00%
O3+O4       67.787 38.86%    54.396 50.18%   209.219 42.24%    88.380 10.08%   174.158 37.91%    26.790 14.04%    57.270 30.11%
Best opt.   66.340 35.90%    48.509 33.93%   192.401 30.80%    86.314  7.50%   154.490 22.33%    25.999 10.67%    54.494 23.81%
Table 4.3: Benchmarks measuring the reduction of overhead by optimizations (time in seconds). JIT compiler disabled
Figure 4.3 clearly shows how the different optimizations (O4 is not shown) reduce the overhead introduced by the accounting. It shows that the reduction of the overhead percentage is homogeneous for each optimization.

[Figure: bar chart of the overhead percentage (0% to 60%) for each benchmark (mtrt, jess, compress, db, mpegaudio, jack, javac), with one bar per version: no optimizations, O1, O2, O3, O4, O1+O2, O1+O3, O1+O4, O2+O4, O3+O4.]
Figure 4.3: Overhead of optimized versions of rewritten SPECjvm98 in percentage. JIT compiler disabled.
The total time as well as the overhead for each optimization has been calculated (see Table 4.4). The overhead (in percent) is computed relative to the execution without any rewriting (i.e. the normal execution of the benchmarks). With these values it is possible to estimate globally how the different optimizations behave over all the benchmarks. By calculating the value $(1 - \mathit{OptimizedTime}/\mathit{RewrittenTime}) \times 100$, we find that the overhead can be reduced by about 2-4% (see the "Reduced Overhead" column in Table 4.4). Optimization O1, taken independently, reduces the overhead by about 3%, followed by O3 and O2. The combination O1+O3 reduces the overhead by about 4%, and the "Best opt." row, corresponding to the best optimization for each benchmark taken independently (see Table 4.3), also gives an overhead reduction of about 4%, which roughly compensates for the volatile accounting (about 5%) (see Section 4.1).
Version     Total Times (sec.)   Overhead (%)   Reduced Overhead (%)
Normal      506.214              0.00%          -
Rewritten   658.033              29.99%         0.00%
O1          638.197              26.07%         3.01%
O2          646.404              27.69%         1.77%
O3          640.006              26.43%         2.74%
O1+O2       642.838              26.99%         2.31%
O1+O3       628.690              24.19%         4.46%
Best opt.   628.547              24.17%         4.48%
Table 4.4: Total overhead reduction. JIT compiler disabled.
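The two percentages in Table 4.4 can be recomputed from the total times. The following is a minimal sketch, not part of the rewriting tool: the class and method names are illustrative, and the overhead formula is an assumption that happens to reproduce the table values.

```java
final class OverheadMath {
    /** Overhead of a measured run relative to the normal (unrewritten) run, in percent. */
    static double overheadPercent(double normalSeconds, double measuredSeconds) {
        return (measuredSeconds / normalSeconds - 1.0) * 100.0;
    }

    /** (1 - OptimizedTime / RewrittenTime) * 100, the "Reduced Overhead" value. */
    static double reducedOverheadPercent(double rewrittenSeconds, double optimizedSeconds) {
        return (1.0 - optimizedSeconds / rewrittenSeconds) * 100.0;
    }

    public static void main(String[] args) {
        // Total times taken from Table 4.4 (JIT compiler disabled).
        double normal = 506.214;
        double rewritten = 658.033;
        double o1 = 638.197;
        System.out.println("Rewritten overhead: " + overheadPercent(normal, rewritten));
        System.out.println("O1 overhead:        " + overheadPercent(normal, o1));
        System.out.println("O1 reduced by:      " + reducedOverheadPercent(rewritten, o1));
    }
}
```

Note that the reduced overhead is relative to the rewritten time, not to the normal time, which is why it is not simply the difference between the two overhead percentages.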
4.2.2 Enabled Just-in-Time (JIT) compiler
We have seen that enabling the JIT allows applications to run 10 times faster (see the evaluations in Section 4.1). Here, we study the impact of the JIT on the different optimizations that we applied. Table 4.5 shows the measurements performed on each version of SPECjvm98. The execution time (in seconds) and the overhead (in percent) are shown for each benchmark.

We can observe two interesting things in these measurements. The first is that this time optimization O4 behaves as expected, i.e. it reduces the overhead. The second is that the optimizations are no longer homogeneous, as was the case for the measurements with the JIT disabled. This phenomenon can be better observed in Figure 4.4, which shows the overhead percentage for each kind of optimization. For example, the compress benchmark with optimizations O2 and O1+O3 incurs additional overhead (this was not the case without the JIT). Another interesting remark is that the effect of an optimization, in addition to being application-dependent, is also influenced by the use of the JIT: an optimization that is the best for one particular application can become the worst for another (O3 is the best for jess but the worst for mtrt). Thus, the choice of a particular optimization strongly depends on the application itself and also on the optimizations performed by the JIT compiler.

For O3, we used a threshold equal to seven (the number of byte-code instructions introduced for accounting + 1). Since O3 is a generalization of O2, O3 gives better results, as expected. Notice, however, that the simple refinement using the threshold improves the optimization considerably, taking it from the worst optimization (O2) to the best one (O3).

Concerning the partial and the total overhead that the different optimizations allow us to reduce, Table 4.6 shows that when the JIT is enabled, optimization O3 gives the best results, followed
Version     mtrt             jess             compress          db                mpegaudio        jack             javac
Normal      3.783 (0.00%)    5.299 (0.00%)    11.605 (0.00%)    19.775 (0.00%)    4.484 (0.00%)    4.120 (0.00%)    9.400 (0.00%)
Rewritten   4.113 (8.73%)    5.654 (6.69%)    12.630 (8.83%)    20.255 (2.43%)    5.752 (28.27%)   4.350 (5.58%)    10.625 (13.03%)
O1          3.987 (5.39%)    5.481 (3.43%)    12.564 (8.26%)    20.054 (1.41%)    5.637 (25.71%)   4.127 (0.17%)    10.099 (7.44%)
O2          4.041 (6.82%)    5.521 (4.19%)    12.663 (9.12%)    20.171 (2.00%)    5.631 (25.58%)   4.311 (4.64%)    10.395 (10.59%)
O3          4.065 (7.45%)    5.380 (1.53%)    12.563 (8.26%)    20.108 (1.68%)    5.627 (25.49%)   4.189 (1.67%)    9.934 (5.68%)
O4          4.049 (7.03%)    5.796 (9.38%)    12.617 (8.72%)    20.080 (1.54%)    5.683 (26.74%)   4.186 (1.60%)    10.086 (7.30%)
O1+O2       4.043 (6.87%)    5.549 (4.72%)    12.904 (11.19%)   20.167 (1.98%)    5.640 (25.78%)   4.213 (2.26%)    10.494 (11.64%)
O1+O3       4.073 (7.67%)    5.466 (3.15%)    13.000 (12.02%)   20.088 (1.58%)    5.550 (23.77%)   4.121 (0.02%)    9.889 (5.20%)
O1+O4       4.053 (7.14%)    5.501 (3.81%)    12.598 (8.56%)    19.986 (1.07%)    5.616 (25.25%)   4.220 (2.43%)    10.247 (9.01%)
O2+O4       3.976 (5.10%)    5.515 (4.08%)    12.869 (10.89%)   20.107 (1.68%)    5.817 (29.73%)   4.233 (2.74%)    10.010 (6.49%)
O3+O4       4.017 (6.19%)    5.424 (2.36%)    12.628 (8.82%)    19.927 (0.77%)    5.715 (27.45%)   4.220 (2.43%)    10.384 (10.47%)
Best opt.   3.976 (5.10%)    5.380 (1.53%)    12.563 (8.26%)    19.927 (0.77%)    5.550 (23.77%)   4.121 (0.02%)    9.889 (5.20%)
Table 4.5: Benchmarks measuring the reduction of overhead by optimizations (time in seconds).
Version     Total Times (sec.)   Overhead (%)   Reduced Overhead (%)
Normal      58.466               0%             -
Rewritten   63.379               8.40%          -
O1          61.949               5.96%          2.26%
O2          62.733               7.30%          1.02%
O3          61.866               5.82%          2.39%
O4          62.497               6.89%          1.39%
O1+O2       63.010               7.77%          0.58%
O1+O3       62.187               6.36%          1.88%
O1+O4       62.221               6.42%          1.83%
O2+O4       62.527               6.95%          1.34%
O3+O4       62.315               6.58%          1.68%
Best opt.   61.406               5.03%          3.11%
Table 4.6: Total overhead reduction
by O1, O4 and O2. These results cannot be generalized; for other kinds of applications, it is necessary to measure which optimization is best suited. With the JIT enabled, the overhead can be reduced by about 1-2% (see the “Reduced Overhead” column in Table 4.6). If the optimizations that give the best results for each benchmark are considered, the overall overhead can be reduced by about 3%. The last row in Table 4.6 shows that the overhead can be reduced to approximately 5% (against 8.4% with no optimizations and 29.99% with no optimizations and no JIT), which is very reasonable.
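The “Best opt.” row is obtained by taking, for each benchmark, the optimized version with the smallest execution time. As a minimal sketch (the class and method names are illustrative), using the mtrt and jess times from Table 4.5:

```java
final class BestOptimization {
    // Optimized versions in the order used by Table 4.5.
    static final String[] VERSIONS =
        {"O1", "O2", "O3", "O4", "O1+O2", "O1+O3", "O1+O4", "O2+O4", "O3+O4"};

    // Execution times in seconds with the JIT enabled (Table 4.5).
    static final double[] MTRT = {3.987, 4.041, 4.065, 4.049, 4.043, 4.073, 4.053, 3.976, 4.017};
    static final double[] JESS = {5.481, 5.521, 5.380, 5.796, 5.549, 5.466, 5.501, 5.515, 5.424};

    /** Returns the name of the version with the smallest execution time. */
    static String best(String[] versions, double[] seconds) {
        int bestIndex = 0;
        for (int i = 1; i < seconds.length; i++) {
            if (seconds[i] < seconds[bestIndex]) {
                bestIndex = i;
            }
        }
        return versions[bestIndex];
    }

    public static void main(String[] args) {
        System.out.println("mtrt: " + best(VERSIONS, MTRT));
        System.out.println("jess: " + best(VERSIONS, JESS));
    }
}
```

The selection is different for the two benchmarks, which illustrates why the best optimization has to be measured per application.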
[Figure 4.4: bar chart. x-axis: Benchmarks (mtrt, jess, compress, db, mpegaudio, jack, javac); y-axis: Overhead Percentage (0% to 30%); legend: No optimizations, O1, O2, O3, O4, O1+O2, O1+O3, O1+O4, O2+O4, O3+O4.]
Figure 4.4: Overhead of optimized versions of rewritten SPECjvm98 in percentage.
4.2.3 Estimation of the size overhead

Version     Size (bytes)   Overhead
Normal      1894400        -
Rewritten   2529280        33.51%
O1          2508800        32.43%
O2          2498560        31.89%
O3          2488320        31.35%
O4          2529280        33.51%
O1+O2       2488320        31.35%
O1+O3       2488320        31.35%
O1+O4       2508800        32.43%
O2+O4       2498560        31.89%
O3+O4       2488320        31.35%
Table 4.7: Size overhead of different optimizations applied to SPECjvm98
We have measured the size overhead introduced by the different optimizations. Table 4.7 shows the total size (in bytes) of the SPECjvm98 benchmark classes and of the rewritten classes (about 520 .class files). We can see that CPU accounting (without optimizations) increases the application size by about 33%. This is due to the additional wrapper methods, the additional argument that is passed, and the accounting byte-code instructions inserted for each accounting block. Applying optimizations O1, O2 and O3 reduces the size overhead by about 1-2%. As expected, O4 does not reduce the size overhead: it uses a local variable to cache the accounting value (a behavioral optimization) and does not modify the size of the application. Finally, we can see that for all SPECjvm98 benchmarks O3 is a good compromise, since it reduces both the size overhead and the execution overhead.
Chapter 5
Applications

In this chapter we show concrete examples of how the rewriting tool can be used. A first example shows how the accounting information can be obtained after the rewriting process. This requires defining an “entry point” in order to display the actual accounting value. This simple example also shows the (minimal) modifications that are required to perform CPU consumption profiling, which can be applied to a wide range of applications. A second example concerns the use of the generated accounting information in the context of a mobile agent system. This allows the introduction of resource control mechanisms that distribute the CPU resources fairly among agents and prevent an agent from consuming more CPU than it is allowed. We do not describe the integration of the rewriting process itself, but rather the use of the information that the rewriting tool makes available.
5.1 A simple applet demo
This section describes an application that is accounted for CPU resource consumption. We took a well-known applet, the “Sorting Algorithm Demo” (http://java.sun.com/applets/jdk/1.1/demo/SortDemo/example1.html), in order to extract the accounting information without any modification of its source code. This application compares several sorting algorithms by showing the progression of the data being sorted. Our goal is to rewrite the classes that compose the application and show the value of the CPU resource counter in real time.

The application is composed of the following classes: SortAlgorithm, the superclass of all the specialized sorting algorithms (BubbleSortAlgorithm, BidirBubbleSortAlgorithm and QsortAlgorithm), and SortItem, which is the applet itself. SortItem instantiates the different sort algorithms and displays the sorting process on screen.

In this particular application, we rewrote all the sort algorithm classes with the tool. Only the applet class itself (SortItem) was modified manually (at source code level), in order to add the code needed to “display” the accounting value (i.e. it requires an “entry point”). Of course, this modification could also be performed at bytecode level. However, since the applet itself is not rewritten for accounting and only serves as a container, we decided not to implement a new rewriting tool just for this particular application.

The classes that implement the sorting algorithms are rewritten without changing any method invocations to JDK classes. This is necessary because in this example we do not rewrite the JDK classes. All other invocations are rewritten with the CPUAccount argument. Now, all classes have “accounting code” for the CPU resource. The SortItem source code is modified so that it creates a CPUAccount object that is passed to each (rewritten) sorting algorithm. Finally, the modified applet is compiled. This is possible since the sort algorithm .class files contain the methods with the additional accounting object. Each applet has an instance of the CPUAccount class, and in SortItem all invocations of application methods are changed to those with the CPUAccount argument.

Figure 5.1 shows the modified execution of the “Sorting Algorithm Demo”. The CPU resource counter is shown in real time on the right side of each applet. This sample can be found online at http://abone.unige.ch
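The role of the “entry point” can be illustrated without the applet machinery. The sketch below is written by hand and is only an imitation of what rewritten code looks like: the field name consumption, the block cost of 8 and all class names other than CPUAccount are assumptions made for illustration, not the output of the tool.

```java
import java.util.Arrays;

final class EntryPointSketch {
    /** Hypothetical stand-in for the thesis' CPUAccount class (field name assumed). */
    static final class CPUAccount {
        int consumption;
    }

    /** Imitation of a rewritten class: extra CPUAccount argument plus accounting code. */
    static final class RewrittenBubbleSort {
        static final int BLOCK_COST = 8; // hypothetical cost of the inner accounting block

        static void sort(int[] a, CPUAccount cpu) {
            for (int i = a.length - 1; i > 0; i--) {
                for (int j = 0; j < i; j++) {
                    cpu.consumption += BLOCK_COST; // stands in for the inserted accounting code
                    if (a[j] > a[j + 1]) {
                        int t = a[j];
                        a[j] = a[j + 1];
                        a[j + 1] = t;
                    }
                }
            }
        }
    }

    /** The "entry point": creates the account, passes it on, then reads the counter for display. */
    static String runAndReport(int[] data) {
        CPUAccount cpu = new CPUAccount();
        RewrittenBubbleSort.sort(data, cpu);
        return "CPU: " + cpu.consumption;
    }

    public static void main(String[] args) {
        int[] data = {3, 1, 2};
        System.out.println(runAndReport(data) + " " + Arrays.toString(data));
    }
}
```

Only runAndReport corresponds to manually written code in the sense of this section; everything inside RewrittenBubbleSort is what the tool would be expected to produce from unmodified class files.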
Figure 5.1: Accounting CPU resource: demonstration applet.
5.2 Integration into the JSEAL-2 mobile agent system
Since our approach provides a portable solution for resource accounting, it can be integrated into any Java-based mobile agent system. In this section we describe how accounting information is used in the JSEAL-2 mobile agent system. The resource control model of JSEAL-2, which takes advantage of the tool described in this thesis, can be found in [4]. The final goal of the JSEAL-2 resource control approach is to prevent denial-of-service (DoS) attacks, i.e. to prevent abusive usage of resources by untrusted (or buggy) mobile agents.

The resource control model of JSEAL-2 is shown in Figure 5.2. It is based on a hierarchical resource distribution model. This model fits well with the hierarchical model of the Seal-calculus [26], on which JSEAL-2 is based. JSEAL-2 manages a tree hierarchy of nested protection domains, which may be either mobile objects or service components. (Here the term “protection domain” refers to the concept of a process or task in an operating system, and not to the Java 2 JDK class java.security.ProtectionDomain.) Each mobile object and service executes in a protection domain of its own, called a sealed object, or seal for short. The root domain (at the top of the hierarchy) receives infinite resources (i.e. no restrictions). Resources are allocated to a domain when it is created; the domain can then share its resources with its sub-domains or donate them to them. The root domain and others (such as shared applications or services) are considered “trusted domains” (i.e. no accounting is needed). Mobile agents are executed in protection domains that are not trusted. Agents may not exceed the resource limits assigned to their domain. The resource accounting tool that we developed is used to determine the amount of resources that an agent currently uses. Agents are rewritten to make them “resource aware”, and then executed in their protection domain. Our tool can be used to control CPU resources, but the byte-code rewriting approach can also be applied to control other kinds of resources.

[Figure 5.2: diagram of a domain hierarchy. Labels recoverable from the original: “Fully trusted domains (no accounting needed)”, “RootSeal”, “Untrusted application”; edges labeled “share” and “split”; CPU and MEM limits per domain (e.g. CPU 15%, MEM 40 MB, CPU 5%, MEM 10 MB).]

Figure 5.2: Illustration of the general resource control model.
5.2.1 JSEAL-2 Scheduler
For CPU control, JSEAL-2 accounts the number of executed byte-code instructions for each thread running in a domain. A high-priority scheduler thread, which is part of the J-SEAL2 kernel, executes periodically in order to ensure that the assigned CPU limits are respected. The scheduler thread calculates the number of executed byte-code instructions for each set of domains sharing a CPU limit by summing up the CPU consumption of all threads executing in a domain of the set. The scheduler compares the number of executed byte-code instructions with the desired schedule. If a set of domains has exceeded its CPU limit, the priorities of the threads executing in these domains are lowered.

Each thread is accounted separately. The scheduler is responsible for accumulating the accounting data of all threads executing in a set of domains sharing a CPU limit. For each CPUAccount object, the scheduler thread always stores the counter value it has read most recently. The scheduler calculates the difference between the current value and the previously stored value in order to determine the number of byte-code instructions executed during the last time-slice (because of the lack of synchronization, the scheduler must not reset any CPUAccount object). If a thread did not exist before, the scheduler assumes the previously stored value to be zero. When a thread terminates, its CPUAccount object is not disposed of immediately; it is kept until the scheduler has examined it.

The scheduler has to deal with overflow of the counter of a CPUAccount object. The counter must be large enough that its full range cannot be consumed in a single time-slice. For current JVMs and a reasonably small time-slice, a Java int is sufficient. However, in future high-performance systems, CPUAccount objects may have to maintain long values.
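The difference computation described above can be sketched as follows. This is not the J-SEAL2 scheduler: the class names (other than CPUAccount), the field name consumption and the use of an identity map are assumptions made for illustration. The sketch relies on the fact that Java int subtraction wraps around, so the difference stays correct across a counter overflow as long as fewer than 2^31 instructions are counted in one time-slice.

```java
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

final class SchedulerSketch {
    /** Hypothetical stand-in for the thesis' CPUAccount class (field name assumed). */
    static final class CPUAccount {
        int consumption;
    }

    // Counter value most recently read for each account.
    private final Map<CPUAccount, Integer> lastSeen = new IdentityHashMap<>();

    /** Instructions counted since the previous call; the account itself is never reset. */
    int consumedSinceLastSlice(CPUAccount account) {
        int current = account.consumption;
        Integer previous = lastSeen.put(account, current);
        int base = (previous == null) ? 0 : previous; // unknown thread: previous value is zero
        return current - base; // wrap-around subtraction tolerates counter overflow
    }

    /** Sum over all accounts of a set of domains sharing one CPU limit. */
    long consumedByDomainSet(List<CPUAccount> accounts) {
        long total = 0;
        for (CPUAccount account : accounts) {
            total += consumedSinceLastSlice(account);
        }
        return total;
    }
}
```

A caller would compare the result of consumedByDomainSet with the budget of the domain set for the last time-slice and lower thread priorities when the budget is exceeded; that policy is left out of the sketch.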
5.3 Other possible applications
Profiling: An important step in software development is profiling, i.e. the ability to monitor and trace events that occur at run time, to calculate the cost of those events, and to attribute that cost to particular parts of an application. For example, a profiler can show which part of an application consumes the most CPU. With our approach of accounting through bytecode rewriting, it is possible to implement portable profilers, i.e. to calculate the CPU consumption of a method from the value of a counter before and after its invocation. Multi-threaded applications can also be profiled by using one counter per thread.

Schedulers: As discussed in Section 5.2.1, our implementation can be integrated into a mobile agent system. The information obtained about CPU consumption is used by a special scheduler to raise or lower the priority of a thread, or to stop it. More generally, schedulers in Java-based systems can use our implementation for the same purposes, i.e. to modify the priorities of processes or to stop them if they exceed a given limit.

Servlets: Another possible application is resource control in a servlet engine integrated into a web server. Since a servlet can spawn many threads to process several client requests, per-thread resource accounting can be a solution for resource control, e.g. for billing purposes.

Java processors: Newly designed Java processors can benefit from bytecode-based accounting for battery resource control in mobile phones. The number of instructions executed by an application can be used to estimate its consumption in watts and then to control the battery resource.

Many other applications that need resource control and resource accounting together with portability can be imagined.
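The profiling idea (reading a counter before and after a method invocation, with one counter per thread) can be sketched as follows. The names ProfilerSketch, AccountedCall and the field consumption are hypothetical; only the before/after principle comes from the text.

```java
final class ProfilerSketch {
    /** Hypothetical stand-in for the thesis' CPUAccount class (field name assumed). */
    static final class CPUAccount {
        int consumption;
    }

    /** A rewritten method, i.e. one that receives the accounting object. */
    interface AccountedCall {
        void run(CPUAccount cpu);
    }

    // One counter per thread, so that multi-threaded applications can be profiled.
    private static final ThreadLocal<CPUAccount> PER_THREAD =
        ThreadLocal.withInitial(CPUAccount::new);

    static CPUAccount currentThreadAccount() {
        return PER_THREAD.get();
    }

    /** Cost of one invocation: counter value after the call minus the value before it. */
    static int cost(CPUAccount cpu, AccountedCall call) {
        int before = cpu.consumption;
        call.run(cpu);
        return cpu.consumption - before;
    }
}
```

For example, `ProfilerSketch.cost(ProfilerSketch.currentThreadAccount(), cpu -> someRewrittenMethod(cpu))` would return the number of units charged by a (hypothetical) rewritten method someRewrittenMethod during that call.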
Chapter 6
Conclusions and Future work

Conclusions

We have presented the implementation of CPU resource accounting for Java and justified the design choices. We have demonstrated that it is possible to modify compiled Java applications and to extract information about the CPU resources they consume directly from their byte code. Although working at a very low level (bytecode) may seem very difficult, using a framework such as BCEL limits the complexity of the manipulation and makes powerful software adaptations possible. We have shown how resource accounting can be performed, and how the extracted CPU consumption information can be used for resource control.

This implementation is fully portable across Java-based systems because it is written entirely in Java and does not rely on any native library or low-level mechanism of the underlying operating system. Because the accounting instructions are added to the Java byte code, no application source code is needed. Portability ensures that this implementation is widely applicable. Given these properties, we conclude that the initial goals were achieved.

We also described the use of this implementation in different systems and applications, particularly for resource control in mobile agent systems (JSEAL-2).

The execution overhead produced by the insertion of accounting code was measured, showing that our approach introduces an acceptable overhead. The overhead can be reduced using simple optimization algorithms that improve the performance of rewritten applications. We can therefore conclude that the price to be paid for portability and resource accounting is acceptable if we consider the wide range of applications that can be developed around this implementation.

Many problems related to rewriting generic shared libraries such as the JDK classes were resolved, and all the issues were described in detail to help future developments. The initial results of the optimization evaluations may help in the design of new rewriting optimization algorithms.
Personal conclusion

This work allowed me to learn about modern techniques in the design of resource accounting and resource control models, and to better understand the importance of resource awareness in mobile agent systems. It also showed me the importance of manipulating low-level code close to assembly (Java bytecode), and motivated me to learn about modern VM internals.
Future Work

The current implementation rewrites bytecode “off-line” and has not been optimized to be fast enough to rewrite applications “on-line”. An interesting extension would be to adapt the tool for “load-time” rewriting, in order to facilitate its integration into systems where load-time modification is crucial. Integration into different mobile agent or active network systems would also be interesting future work. Another direction concerns the implementation of more flexible configuration strategies using XML, since the tool currently uses a static configuration mechanism. The applications proposed in Section 5.3 can be implemented in future work; only the imagination limits what remains to be done.
Appendix A
Rewriting Tool Execution

This appendix describes how to use and configure the rewriting tool.
A.1 Packages
The tool is divided into three Java packages:

tools.rc: This package contains the classes shared by the tools.rc.analyzer.Analyzer and tools.rc.cpu.CPU tools. The first tool produces the list of methods that should not receive a wrapper method. The second is the main rewriting tool, which inserts the accounting code. The tools.rc.Tool class is an abstract class that contains the methods common to both tools. The list of classes to be rewritten is represented by the tools.rc.ClassSet class, and the list of methods that do not receive a wrapper by the tools.rc.MethodSet class. The tools.rc.Config class represents the configuration of the tool; the tools.rc.Printer and tools.rc.PrintStream classes are used for debugging purposes.

tools.rc.analyzer: This package contains the tools.rc.analyzer.Analyzer tool. This tool can be used by tools.rc.cpu.CPU to obtain the list of methods that do not receive a wrapper (e.g. for JDK rewriting), or it can be used independently.

tools.rc.cpu: This is the main package, containing the rewriting tool. The classes that compose it are shown in Section 2.2.1.
A.2 Usage

A.2.1 analyzer
The analyzer can be used independently of the CPU tool by typing in the console:

    java [options] tools.rc.analyzer.Analyzer [classes]

The [options] are user-defined system properties that can be specified with -Dname=value. The possible options are:

    name             value
    input-list       name of a text file containing the list of classes to analyze.
    nowrapper-list   output text file where the wrapper-less signatures are written.
For short lists of classes, the class names can be given as command line parameters. Each class name must be fully qualified (e.g. to analyze the standard String class, java.lang.String must be specified). Method signatures are written to the output file together with their class, e.g. int floatToIntBits(float f) in class java.lang.Float is written as java.lang.Float.floatToIntBits(F)I. An example of use:

    %> java -Dinput-list=ibmjdk_w2k_classes.list \
            -Dnowrapper-list=ibmjdk_w2k_methods.list tools.rc.analyzer.Analyzer \
            java.lang.Thread
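The signature format shown above (class name, method name, then a JVM method descriptor) can be reproduced with the standard reflection API. The following sketch is not the Analyzer's code; the class name WrapperlessSignature is illustrative, and it is an assumption that the tool uses plain JVM descriptors for object types as well, since the text only gives the (F)I example.

```java
import java.lang.invoke.MethodType;
import java.lang.reflect.Method;

final class WrapperlessSignature {
    /** Builds "<class>.<method><descriptor>" for a reflected method. */
    static String of(Method m) {
        String descriptor = MethodType
            .methodType(m.getReturnType(), m.getParameterTypes())
            .toMethodDescriptorString();
        return m.getDeclaringClass().getName() + "." + m.getName() + descriptor;
    }

    /** Convenience overload that looks the method up by name and parameter types. */
    static String of(Class<?> owner, String name, Class<?>... parameterTypes) {
        try {
            return of(owner.getDeclaredMethod(name, parameterTypes));
        } catch (NoSuchMethodException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        // prints "java.lang.Float.floatToIntBits(F)I"
        System.out.println(of(Float.class, "floatToIntBits", float.class));
    }
}
```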
A.2.2 cpu
To rewrite classes for resource accounting, the CPU tool is executed as follows:

    java [options] tools.rc.cpu.CPU [classes]

The options that the tool accepts are:

    name             value
    input-list       name of a text file that contains the list of classes to rewrite.
    nowrapper-list   output text file that contains the wrapper-less method signatures.
    rewjdk           this option does not have a value.
    optimization     optimizations to apply: O1, ..., On.
    out              this option does not have a value.
    analyze          this option does not have a value.
Options without a value are used like boolean variables: if they are defined, the value “true” is assigned.

rewjdk: This option enables the rewriting of method calls to JDK libraries; the nowrapper-list option then becomes mandatory. If rewjdk is not specified, the tool assumes that calls to methods in JDK classes are not to be modified, and the nowrapper-list option is ignored.

out: If enabled, the tool prints debug information to the console.

analyze: If enabled, Analyzer is used: the class names in the input-list and those passed as parameters are given to Analyzer. The result is then added to the list of method signatures that receive no wrapper.

Like Analyzer, CPU receives the list of classes to be rewritten and the list of methods that do not receive a wrapper. The class names can also be passed as command line parameters. Optimizations are specified with the “-Doptimization” option; to apply more than one, they must be separated by commas. Here is an example of using the tool for JDK rewriting:

    %> java -Drewjdk -Dinput-list=ibjdk_w2k_classes.list \
            -Dnowrapper-list=ibjdk_w2k_methods.list -Doptimization=O1,O3 \
            tools.rc.cpu.CPU java.lang.Thread
To use the tool, the CLASSPATH environment variable must point to both BCEL.jar and RC.jar files.
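The way these -D options behave can be illustrated with java.util.Properties. This is a hypothetical sketch, not the tools.rc.Config class: the class and field names are invented, and only the option names and the rules stated above (flags are true when defined, nowrapper-list is mandatory with rewjdk and ignored otherwise, optimizations are comma-separated) come from the text. It relies on the fact that java -Drewjdk defines the property with an empty value.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

final class ToolOptionsSketch {
    final boolean rewriteJdk;
    final boolean debugOutput;
    final boolean analyze;
    final String inputList;
    final String nowrapperList;
    final List<String> optimizations = new ArrayList<>();

    ToolOptionsSketch(Properties props) {
        // An option without a value counts as "true" as soon as it is defined.
        rewriteJdk = props.getProperty("rewjdk") != null;
        debugOutput = props.getProperty("out") != null;
        analyze = props.getProperty("analyze") != null;
        inputList = props.getProperty("input-list");
        // nowrapper-list is only taken into account when rewjdk is set.
        nowrapperList = rewriteJdk ? props.getProperty("nowrapper-list") : null;
        if (rewriteJdk && nowrapperList == null) {
            throw new IllegalArgumentException("rewjdk requires nowrapper-list");
        }
        String selected = props.getProperty("optimization");
        if (selected != null) {
            for (String name : selected.split(",")) {
                if (!name.trim().isEmpty()) {
                    optimizations.add(name.trim());
                }
            }
        }
    }
}
```

In a real command-line tool the constructor would typically be called as `new ToolOptionsSketch(System.getProperties())`.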
Bibliography

[1] G. Back, W. Hsieh, and J. Lepreau. Processes in KaffeOS: Isolation, resource management, and sharing in Java. In Proceedings of the Fourth Symposium on Operating Systems Design and Implementation (OSDI'2000), San Diego, CA, USA, October 2000. Cited in: 1, 2.
[2] J. Baumann, F. Hohl, K. Rothermel, and M. Strasser. Mole - concepts of a mobile agent system. World Wide Web Journal, special issue on Distributed World Wide Web Processing: Applications and Techniques of Web Agents, 1998. Cited in: 1.1.1.
[3] Walter Binder, Jarle Hulaas, and Alex Villazón. Resource control in J-SEAL2. Technical Report Cahier du CUI No. 124, University of Geneva, October 2000. Cited in: 1, 1.1.1.
[4] Walter Binder, Jarle Hulaas, Alex Villazón, and Rory Vidal. Portable Resource Control in Java, The J-SEAL2 Approach. In ACM Conference on Object-Oriented Programming Systems, Languages, and Applications (OOPSLA'01), Tampa Bay, Florida, USA, October 2001. Cited in: 5.2.
[5] Boris Bokowski and Markus Dahm. Poor man's genericity for Java. In Java-Informations-Tage, pages 60–76, 1998. Cited in: 1.1.1.
[6] G. Bollella, B. Brosgol, P. Dibble, S. Furr, J. Gosling, D. Hardin, and M. Turnbull. The Real-Time Specification for Java. Addison-Wesley, Reading, MA, USA, 2000. Cited in: 2.
[7] Shigeru Chiba. Load-time structural reflection in Java. In ECOOP, pages 313–336, 2000. Cited in: 2.
[8] Geoff Cohen, Jeff Chase, and David Kaminsky. Automatic program transformation with JOIE. In 1998 USENIX Annual Technical Symposium, pages 167–178, 1998. Cited in: 2.
[9] Grzegorz Czajkowski and Thorsten von Eicken. JRes: A resource accounting interface for Java. In Proceedings of the 13th Conference on Object-Oriented Programming, Systems, Languages, and Applications (OOPSLA-98), volume 33(10) of ACM SIGPLAN Notices, pages 21–35, New York, USA, October 18–22 1998. ACM Press. Cited in: 1, 1.
[10] Markus Dahm. Byte code engineering. In Java-Information-Tage 1999 (JIT'99), September 1999. http://bcel.sourceforge.net/. Cited in: 2, 2.1.
[11] F.-X. Le Louarn. JUM, a Java Usage Monitor. Web pages at http://www.iro.umontreal.ca/~lelouarn/jum.html. Cited in: 2.
[12] James Gosling, Bill Joy, and Guy L. Steele. The Java Language Specification. The Java Series. Addison-Wesley, Reading, MA, USA, 1996. Cited in: (document), 2.3.4.
[13] Fritz Hohl. Mobile agent list. http://www.informatik.uni-stuttgart.de/ipvr/vs/projekte/mole/mal/mal.html. Cited in: 1.
[14] Ralph Keller and Urs Hölzle. Binary component adaptation. In Eric Jul, editor, ECOOP '98 - Object-Oriented Programming, volume 1445 of Lecture Notes in Computer Science, pages 307–329. Springer, 1998. Cited in: 2.
[15] Han Bok Lee and Benjamin G. Zorn. BIT: A tool for instrumenting Java bytecodes. In Proceedings of the USENIX Symposium on Internet Technologies and Systems (ITS-97), pages 73–82, Berkeley, December 8–11 1997. USENIX Association. Cited in: 2.
[16] Tim Lindholm and Frank Yellin. The Java Virtual Machine Specification. Addison-Wesley, Reading, MA, USA, second edition, 1999. Cited in: 1, 1.1.1, 1.1.1.
[17] Object Space Inc. Voyager 3.0, 1999. http://www.objectspace.com/Products/voyagerORB.htm. Cited in: 1.1.1.
[18] Takahiro Sakamoto, Tatsurou Sekiguchi, and Akinori Yonezawa. Bytecode transformation for portable thread migration in Java. In Second International Symposium on Agent Systems and Applications (ASA'2000) and Fourth International Symposium on Mobile Agents (MA'2000) ASA/MA'2000, Zürich - Switzerland, 2000. Cited in: 1.1.1.
[19] T. Suganuma, T. Ogasawara, M. Takeuchi, T. Yasue, M. Kawahito, K. Ishizaki, H. Komatsu, and T. Nakatani. Overview of the IBM Java Just-in-Time compiler. IBM Systems Journal, 39(1):175–193, 2000. Cited in: 2, 1.1.1.
[20] Sun Microsystems, Inc. Java Virtual Machine Profiler Interface (JVMPI). Web pages at http://java.sun.com/j2se/1.3/docs/guide/jvmpi/index.html. Cited in: 2.
[21] Sun Microsystems Inc. Jini Connection Technology. Sun Microsystems Inc., http://www.sun.com/jini, 1999. Cited in: 1.1.1.
[22] Niranjan Suri, Jeffrey M. Bradshaw, Maggie R. Breedy, Paul T. Groth, Gregory A. Hill, Renia Jeffers, Timothy S. Mitrovich, Brian R. Pouliot, and David S. Smith. NOMADS: toward a strong and safe mobile agent system. In Carles Sierra, Gini Maria, and Jeffrey S. Rosenschein, editors, Proceedings of the 4th International Conference on Autonomous Agents (AGENTS-00), pages 163–164, NY, June 3–7 2000. ACM Press. Cited in: 2.
[23] The Standard Performance Evaluation Corporation. SPEC JVM98 Benchmarks. Web pages at http://www.spec.org/osg/jvm98/, 1998. Cited in: 4.1.
[24] Eddy Truyen, Bert Robben, Bart Vanhaute, Tim Conninx, and Wouter Joosen. Portable support for transparent thread migration in Java. In Second International Symposium on Agent Systems and Applications (ASA'2000) and Fourth International Symposium on Mobile Agents (MA'2000) ASA/MA'2000, Zürich - Switzerland, 2000. Cited in: 1.1.1.
[25] Christian F. Tschudin. Open resource allocation for mobile code. In Proceedings of The First Workshop on Mobile Agents, Berlin, Germany, April 1997. Cited in: 2.
[26] Jan Vitek and Giuseppe Castagna. Seal: A framework for secure mobile computations. In Internet Programming Languages, 1999. Cited in: 5.2.
[27] Ian Welch and Robert Stroud. Kava - A Reflective Java Based on Bytecode Rewriting. In Walter Cazzola, Robert Stroud, and Francesco Tisato, editors, Reflection and Software Engineering, LNCS, pages 155–167. Springer Verlag, 2000. Cited in: 1.1.1.
[28] Ian Welch and Robert J. Stroud. Using reflection as a mechanism for enforcing security policies in mobile code. In ESORICS, pages 309–323, 2000. Cited in: 1.1.1.