
Patterns and Tools for Achieving Predictability and Performance with Real-time Java ∗†

Krishna Raman, Yue Zhang, Mark Panahi, Juan A. Colmenares, Raymond Klefstad
{kraman, yuez, mpanahi, jcolmena, klefstad}@uci.edu
Department of Electrical Engineering and Computer Science
University of California, Irvine, CA 92697, USA

Abstract

The Real-time Specification for Java (RTSJ) offers the predictable memory management needed for real-time applications, while maintaining Java’s advantages of portability and ease of use. RTSJ’s scoped memory allows object lifetimes to be controlled in groups, rather than individually as in C++. While easier than individual object lifetime management, scoped memory adds programming complexity from strict rules governing memory access across scopes. Moreover, memory leaks can potentially create jitter and reduce performance. To manage the complexities of RTSJ’s scoped memory, we developed patterns and tools for RTZen, a Real-time CORBA Object Request Broker (ORB). We describe four new patterns that enable communication and coordination across scope boundaries, an otherwise difficult task in RTSJ. We then present IsoLeak, a runtime debugging tool that visualizes the scoped hierarchies of complex applications and locates memory leaks. Our empirical results show that RTZen is highly predictable and has acceptable performance. RTZen therefore demonstrates that the use of patterns and tools like IsoLeak can help applications meet the stringent QoS requirements of DRE applications, while supporting safer, easier, cheaper, and faster development in Real-time Java.

1 Introduction

Real-time systems are complex and costly to develop, maintain, and extend. The Java language has proven to be a convenient and fast platform for application development, but is unable to offer real-time systems’ required Quality of Service (QoS) guarantees of predictability for two primary reasons: 1) the under-specified scheduling semantics of Java threads can lead to situations where the most eligible thread is not allowed to run; and 2) the Java garbage collector can preempt any other Java thread, thus causing unpredictably long preemption latencies.

The Real-Time Java Experts Group has defined the Real-Time Specification for Java (RTSJ) [3] in order to bring Java’s benefits, particularly ease of use, to the development of real-time systems. RTSJ provides stronger guarantees on thread semantics and defines new memory management models that allow allocation of objects not micro-managed by the garbage collector. RTSJ is therefore suitable for implementing predictable real-time systems.

The benefits of development in Java are particularly valuable for distributed real-time systems. Distributed computing has always been especially complex, tedious, and error-prone. Java’s portability and platform independence are attractive benefits for providing the communication layer between heterogeneous platforms in a networked system. Combined with the benefits of Real-time CORBA, which offers a standard interface for developing distributed applications, Real-time CORBA middleware written in real-time Java has potential to significantly reduce the cost and complexity of developing distributed, real-time systems. By using RTSJ’s newly-defined thread scheduling semantics and memory model, CORBA middleware implemented in Java can provide the best of both worlds: a portable, developer-friendly language as well as the guarantee of predictability required by real-time systems.

Using RTSJ’s newly-defined threads in CORBA middleware is fairly straightforward. RTSJ’s memory model, however, poses a number of challenges to the Real-time CORBA middleware developer. While significantly simpler than in C++, memory management with RTSJ is necessarily more complex than the automatic memory management of regular Java which sacrifices predictability. Careful design and implementation are therefore required to use the RTSJ memory model correctly to provide real-time predictability while at the same time shielding the application developer from its complexities.

We have developed RTZen, the first implementation of Real-time CORBA middleware in real-time Java, with both goals in mind: providing a high degree of predictability while still maintaining ease of use for the application developer by hiding the implementation details of distributed programming in RTSJ. In addition, RTZen must provide good performance, be compliant with both Real-time CORBA and the RTSJ to offer portability, and be customizable for the diverse needs of distributed, real-time applications. Simultaneously meeting all these design goals posed a number of challenges for our research in middleware for distributed, real-time systems. These challenges were met in two major steps:

1. Organize the middleware’s architecture with design patterns to manage memory correctly yet be transparent to the user, and
2. Verify the implementation’s memory usage to ensure that there are no memory leaks that could compromise predictability.

Section 2 explains how RTZen was organized with design patterns to manage memory correctly and transparently, while Section 3 shows how we developed a tool to verify that the implementation has no memory leaks. The research conducted in these two steps, resulting in the largest open-source RTSJ project, demonstrates that real-time Java and Real-time CORBA are maturing into viable technologies for DRE system development. More importantly, empirical tests, presented in Section 4, show that these specifications can be combined into a single middleware architecture that leverages the advantages of each. The result of achieving these goals is a predictable, efficient, and customizable RTSJ implementation of CORBA that can significantly ease development of complex, distributed, real-time systems.

∗ This material is based upon work supported by the National Science Foundation under Grant No. 0410218.
† Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation.

Proceedings of the 11th IEEE International Conference on Embedded and Real-Time Computing Systems and Applications (RTCSA’05) 1533-2306/05 $20.00 © 2005 IEEE

alive. To allow efficient use of memory for objects with different lifetimes, RTSJ allows scopes to be nested hierarchically. RTSJ-defined scoped memory areas may create multiple child scopes. If region B is entered from region A, then A is considered the parent of B. The resulting nested scoped memory areas produce a tree-like structure called a scope stack, depicted in Figure 1.

2 Using Design Patterns to Manage Memory in RTSJ Middleware

Two major rules govern memory access among scopes, the lifetime rule and the single-parent rule.

First, according to the lifetime rule, code within a given scoped memory B can reference memory in another region A only if the lifetime of region A is at least as long as that of the first region B. This lifetime requirement can be guaranteed only if the requested object resides in an ancestor region (e.g., a parent or grandparent), immortal, or heap memory. A violation of this rule results in throwing an IllegalAssignmentError or IllegalAccessError.

Second, a memory region can have only one parent: a single scope cannot have two or more threads from different parent scopes enter it, thereby preventing cycles in the scope stack. If one thread takes a particular path to get to a memory region and forms a scoped memory hierarchy, a second thread will have to follow the same hierarchy to reach the same memory region; otherwise, a ScopedCycleException is thrown. For example, if a thread enters scope B from A, then another thread that enters B must also enter from A.

An important effect of this single-parent restriction on scoping structure is that a given region cannot access memory residing in its “sibling” region. If these two regions must coordinate to perform some task, they must do so through memory stored in a common ancestor region. For example, in Figure 1, scope C cannot access scope B; they can coordinate only via objects stored in their common parent A or in immortal memory. Figure 2 depicts the complete access rules among the scopes illustrated in Figure 1.²

The restrictions imposed on these memory regions pose challenges for designing real-time middleware such as RTZen. First, the application must be designed so that longer-lived objects are created in ancestor scopes while shorter-lived objects are created in appropriate child scopes so that memory allocation, access, and traversal are as efficient as possible.
Second, any remaining memory access restrictions (such as sibling scopes needing to coordinate with each other) must be resolved with appropriate design patterns.
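The lifetime rule and the single-parent rule can be illustrated without an RTSJ virtual machine. The following is a minimal sketch in plain Java, not RTSJ code: the class `ScopeModel` and its methods are hypothetical names introduced here, and the sketch only imitates the checks for which an RTSJ runtime would raise `IllegalAssignmentError` or `ScopedCycleException`. It assumes ordinary real-time threads, so references to the heap are allowed.

```java
import java.util.HashMap;
import java.util.Map;

/** Plain-Java model of RTSJ scope nesting (hypothetical; not the javax.realtime API). */
final class ScopeModel {
    // Heap and immortal memory are modelled as special names outside the scope tree.
    static final String HEAP = "Heap";
    static final String IMMORTAL = "Immortal";

    // parentOf.get(child) is the scope from which child was first entered.
    private final Map<String, String> parentOf = new HashMap<>();

    /** Records that `child` is entered from `from`; enforces the single-parent rule. */
    void enter(String from, String child) {
        String existing = parentOf.putIfAbsent(child, from);
        if (existing != null && !existing.equals(from)) {
            // An RTSJ runtime would throw ScopedCycleException at this point.
            throw new IllegalStateException(child + " already has parent " + existing);
        }
    }

    /** Lifetime rule: may an object in `from` hold a reference to an object in `to`? */
    boolean canReference(String from, String to) {
        if (to.equals(HEAP) || to.equals(IMMORTAL)) {
            return true; // these areas outlive every scope (assumes heap access is allowed)
        }
        // `to` is a scope: legal only if it is `from` itself or one of its ancestors.
        for (String s = from; s != null; s = parentOf.get(s)) {
            if (s.equals(to)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        ScopeModel m = new ScopeModel();
        m.enter("A", "B"); // hierarchy of Figure 1: A is the parent of B and C
        m.enter("A", "C");
        System.out.println("B may reference A: " + m.canReference("B", "A"));
        System.out.println("C may reference B: " + m.canReference("C", "B"));
    }
}
```

With A as the parent of B and C, the model reproduces the entries of Figure 2: B and C may reference A, heap, and immortal memory, while the siblings B and C may not reference each other.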

Figure 1: Nested Scopes. (Diagram labels: Heap, Immortal, and scopes A, B, and C.)

Managing memory is an important part of programming. Objects consume limited memory resources during their lifetimes, the interval of time from object creation to object destruction. When objects are created, memory must be allocated for them, and this memory must later be deallocated when the object is no longer needed. C++ requires that programmers manage memory for each individual object. This manual memory management is tedious and error-prone; if the programmer forgets to deallocate memory correctly, memory leaks result. Java solved this problem by adding garbage collection, which reclaims unneeded memory automatically. The garbage collector is invoked at unpredictable times, however, causing unacceptable jitter for real-time systems.

RTSJ introduces the concept of scoped memory, which gives programmers the option to reclaim blocks of memory containing multiple objects. These blocks of memory are known as scoped memory regions. When a scope is allocated, a thread may then ‘enter’ that scope to allocate and access objects within that scope. Multiple threads can enter and execute in the same scoped region. Expiration of a scoped region occurs at the moment when no threads are executing in the region. Objects allocated in scoped memory are therefore not reclaimed individually, but rather all at once when the scope containing them is deallocated. The benefit of using scoped memory is that allocation and deallocation of a single (not necessarily contiguous) block¹ are both predictable operations.

In addition to regular scoped memory, a special permanent memory area, immortal memory, can be used to store objects whose lifetime is known to be equivalent to that of the JVM. Objects allocated in immortal memory, however, will never be garbage collected. Immortal memory is generally a good place to put caches and pools of objects. Maintaining entire blocks of memory as scopes is less complex to manage than the individual objects of C++.
A challenge of managing scopes, however, is not to waste memory by having short-lived objects in the same scope as long-lived objects, since the entire scope cannot be reclaimed if any object is still

from/to   Heap   Imm.   A     B     C
Heap      –      yes    yes   yes   yes
Imm.      yes    –      yes   yes   yes
A         no     no     –     yes   yes
B         no     no     no    –     no
C         no     no     no    no    –

Figure 2: Access rules for Fig. 1.

2.1 RTZen’s Scoped Memory Hierarchy

To create an efficient implementation of Real-time CORBA using RTSJ, CORBA objects must be organized hierarchically. This allows their lifetime needs to be matched to the nested RTSJ

1 While RTSJ supports both linear- and variable-time allocation of scoped memory regions, we strictly use the linear-time allocation mechanism in this work.

2 Figure 2 assumes that real-time threads are used; if no-heap real-time threads are used, no references to the heap are permitted.


scoped memory areas that contain them. RTZen was designed with the unique scoped memory hierarchy shown in Figure 3. The main purpose of this hierarchy is to enable objects with similar lifetimes to be independently allocated and freed to follow the Real-time CORBA specification. Since CORBA allows applications to create and destroy various CORBA components (e.g., ORBs and POAs), RTZen enables this by assigning scoped memory areas to each of these components. For example, when an application creates a POA, the associated scoped memory is created, then the POA object is allocated within that scoped memory. Each component with a defined lifetime is allocated in its own scope and maintains its state within this scope. Moreover, some components have child scopes for dependent components with smaller lifetimes, thus creating a tree-like scoped memory structure.

preventing the premature reclamation of a scoped memory area by controlling its lifetime. In addition, we have encountered four new memory management complications while developing RTZen, so we created four new design patterns to solve them.

2.2.1 Separation of Creation and Initialization.

Context. To use memory efficiently, RTSJ applications typically create some pools of recyclable objects, preallocated in specific memory areas such as immortal memory [2].

Problem. Creation of objects in another memory area requires the use of Java reflection. But, reflection can become memory inefficient when creating objects with parameters because the parameters for the reflection call must be objects.

Solution. To solve this issue, the Separation of Creation and Initialization pattern is used. It defines classes with the default constructor that creates uninitialized instances, as well as accessor methods that allow the modification of the object’s internal state (i.e., the configuration), just before they are going to be used. RTZen uses this pattern to (de)marshal requests, as well as to create ORB and POA façades in the memory pools.
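The Separation of Creation and Initialization pattern can be sketched in plain Java as follows. This is an illustration under stated assumptions rather than RTZen’s code: `RequestMessage` and `MessagePool` are hypothetical names, and an ordinary in-heap pool stands in for a pool preallocated in immortal memory. The point shown is that reflection only ever invokes the default constructor, while the configuration is supplied later through an `init` method.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical message type: a default constructor plus a separate init step. */
final class RequestMessage {
    private int requestId;
    private String operation;

    RequestMessage() { } // creates an uninitialized instance; takes no parameters

    /** Configures the instance just before it is used. */
    void init(int requestId, String operation) {
        this.requestId = requestId;
        this.operation = operation;
    }

    int requestId() { return requestId; }
    String operation() { return operation; }
}

/** Hypothetical pool that creates all of its instances up front through reflection. */
final class MessagePool {
    private final Deque<RequestMessage> free = new ArrayDeque<>();

    MessagePool(int size) {
        try {
            for (int i = 0; i < size; i++) {
                // Only the default constructor is called reflectively,
                // so no constructor parameters have to be wrapped in objects.
                free.push(RequestMessage.class.getDeclaredConstructor().newInstance());
            }
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("could not preallocate pool", e);
        }
    }

    /** Takes an instance from the pool; throws NoSuchElementException if the pool is empty. */
    RequestMessage acquire() { return free.pop(); }

    /** Returns an instance to the pool for later reuse. */
    void release(RequestMessage m) { free.push(m); }

    int available() { return free.size(); }
}
```

A caller would acquire an instance, call `init(...)` with the values for the current request, use the object, and release it, so that no allocation occurs once the pool has been filled.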

[Figure 3 diagram; labels recovered from the extraction: client side and server side; Transport Scope, Temp. Scope, Acceptor Scope, Thread Pool Scope, Request Processing Scope, POA Memory Scope, ORB Memory Scope, Base Application Scope; axes: direction of scope nesting and direction of lifetime increasing.]

2.2.2 Cross-scope Invocation.

Context. RTSJ programmers often encounter situations in which the calling object needs to invoke an operation on an object allocated in a different scope, such as a sibling scope.

Problem. However, the memory access rules of RTSJ dictate that a given object can be accessed directly only if it resides in the calling object’s scope stack (an ancestor scope). Therefore, for indirect access to occur, elaborate memory traversal must be performed, in which the control thread must first jump to a scope that is a common ancestor of both objects, then enter the callee object’s region (possibly traversing intermediate regions along the way), and finally invoke the operation.

Solution. By using the ExecuteInRunnable class (see Figure 4), the Cross-scope Invocation pattern can simplify the indirect access process. If necessary, this ExecuteInRunnable class can be used repeatedly to perform such a memory traversal. Figures 5 and 6 show the use of this pattern. Assume the simplest case in which B and C are sibling scopes and A is their parent memory region, with B being the current scope (Figure 6). After being instantiated using the default constructor or obtained from a pool, the ExecuteInRunnable object is initialized with the sibling scope C and a Runnable object that contains the logic to be executed in C. Once the executeInArea method of the MemoryArea class is called by B, the ExecuteInRunnable object starts to run in A,

[Figure 3 diagram, continued: Immortal Memory holds the scoped memory object cache and the ORB Facade, POA Facade, and Object Ref Delegate caches.]

Figure 3: Scoped Memory Structure of RTZen.

2.2 RTZen’s Design Patterns

RTZen’s hierarchical scoped memory structure is the first, large-grained step in the process of mapping the RTSJ memory model to the needs of a Real-time CORBA application. CORBA objects are organized hierarchically by lifetime so that many of the memory access restrictions are handled by RTZen’s overall memory structure. However, several situations are still encountered in which following RTSJ’s memory access restrictions poses complications for those designing applications. These complications can be mitigated through the use of appropriate design patterns. RTZen uses several existing design patterns³ to alleviate some of the most common difficulties of using RTSJ, such as

3 The Immortal Singleton pattern [5] is a simple adaptation of the classical Singleton pattern [7]. It allows the creation of a unique instance of a class from immortal memory, allowing it to be accessed from any memory area. The Wedge Thread [2, 8] pattern is used to prevent the premature reclamation of a scoped memory area by controlling its lifetime. It consists of a real-time thread that enters a scope and blocks, waiting for a signal to exit the area. The Memory Pool pattern [2] is a set of instances of a given class preallocated in a specific memory area (e.g., immortal memory). When an instance of this class is requested, an object is taken from the pool; when the instance is no longer needed, it is returned to the pool. The Encapsulated Method pattern [8] allows for the allocation of objects that represent intermediate results of an algorithm in a temporary scope. After the final result is obtained, the temporary scope is discarded, thereby avoiding unnecessary allocations in the original scope. The Multi-scoped Object pattern allows transparent access of an object regardless of the originating region of the callee. The Memory Block pattern [2] allows the pooling, via serialization, of objects of varying sizes in a byte array block allocated from immortal memory, thus allowing read and write access from any scoped memory and any thread type.


    import javax.realtime.MemoryArea;

    // Runnable that enters memory area a and runs r there.
    public class ExecuteInRunnable implements Runnable {
        private Runnable r;    // logic to run in the target area
        private MemoryArea a;  // target memory area

        // Separate init method so that pooled instances can be reused.
        public void init(Runnable r, MemoryArea a) {
            this.r = r;
            this.a = a;
        }

        public void run() {
            try {
                a.enter(r);
            } catch (Throwable ex) { ... }
        }
    }

Figure 4: The ExecuteInRunnable class.

    MemoryArea parent;
    ScopedMemory sibling;
    Runnable logic;
    ...
    ExecuteInRunnable eir = EIRPool.getEIR();
    eir.init(logic, sibling);
    ...
    try {
        // Run eir in the common parent; eir then enters the sibling scope.
        parent.executeInArea(eir);
    } catch (Throwable t) {
        ...
    } finally {
        EIRPool.freeEIR(eir);  // return the object to its pool
    }
    ...

Figure 5: Using ExecuteInRunnable.

be reported correctly because of an inappropriate mapping between the exception type and the status variable or object (e.g., exceptions are commonly handled using general types); and 4) system performance may be affected since the exception must be re-instantiated several times as it is propagated from scope to scope.

Solution. Consequently, we have designed the Immortal Exception pattern, an efficient and flexible solution that allows exceptions to be handled independently of the memory area in which they are thrown, without violating RTSJ referencing rules. In this pattern, a factory class that creates exception objects of specified types resides in immortal memory. The Immortal Singleton pattern [5] is used to cache the exception objects in the factory so that they can be reused (i.e., re-thrown). Distinct families of exceptions, such as CORBA system exceptions and application exceptions, are organized into different factories.

This pattern offers important advantages and a minor disadvantage. Since all exceptions are allocated in immortal memory, they can be accessed from anywhere, thereby avoiding the boundary problem. This design is particularly useful when the system must handle a large number of exceptions, such as the 400 instances of CORBA system exceptions handled by RTZen. A limitation of this pattern, however, is that since exception objects are preallocated, no message that explains the cause of the run-time exception can be associated with the exception objects. However, good documentation can alleviate this inconvenience.
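The Immortal Exception pattern can be illustrated with a small plain-Java sketch. It is not RTZen’s factory: `ExceptionFactory`, `register`, and `get` are hypothetical names, and a static singleton on the ordinary heap stands in for an object allocated in immortal memory. The sketch shows the caching idea only: one instance per exception type is created during initialization and the same instance is thrown every time.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.NoSuchElementException;

/** Hypothetical factory that caches one reusable instance per exception type. */
final class ExceptionFactory {
    // Stand-in for an Immortal Singleton; under RTSJ this would live in immortal memory.
    private static final ExceptionFactory INSTANCE = new ExceptionFactory();

    private final Map<Class<? extends RuntimeException>, RuntimeException> cache =
            new HashMap<>();

    private ExceptionFactory() { }

    static ExceptionFactory instance() { return INSTANCE; }

    /** Preallocates an exception of the given type (intended for the initialization phase). */
    synchronized void register(Class<? extends RuntimeException> type) {
        try {
            cache.put(type, type.getDeclaredConstructor().newInstance());
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException("no usable default constructor: " + type, e);
        }
    }

    /** Returns the cached instance, so that throwing it allocates nothing new. */
    synchronized <T extends RuntimeException> T get(Class<T> type) {
        RuntimeException cached = cache.get(type);
        if (cached == null) {
            throw new NoSuchElementException("not registered: " + type);
        }
        return type.cast(cached);
    }
}
```

A call site would then write `throw ExceptionFactory.instance().get(UnsupportedOperationException.class);`. Because the instance is created by the default constructor, it carries no message, which mirrors the limitation noted above; its stack trace also reflects the point of creation rather than the point of the throw.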

making the current thread enter C and finally execute the logic provided in the Runnable object. As is common in RTSJ programming, the allocation of arguments and returned values of the requested method requires special care to avoid illegal access errors: arguments must be accessible from the callee scope, and returned values must be accessible from the caller scope. This requirement may add significant code complexity, but this complexity can be alleviated by the adoption of the Memory Pool and Memory Block patterns [2].

Figure 6: Invocation between sibling scopes. (Diagram labels: Scope A, Scope B, Scope C, executeInArea(), enter().)

2.2.4 Immortal Façade.

Context. A consequence of RTSJ’s scoping rules is that large RTSJ applications, such as RTZen, often have complex scoping structures.

Problem. Scoping structures introduce more development complexity to application users. In general, when objects in different scopes interact using method calls, the complexity of traversing the memory structure is exposed to both the caller object and callee object. Furthermore, the caller is typically tightly coupled with the system’s memory structure, in particular with the callee object’s locality. This exposed complexity makes development and system maintenance more difficult and therefore compromises one of RTZen’s design goals.

Solution. To hide complexity from the application developer, as well as to minimize the dependencies of the caller object on the callee object’s memory locality, we used the Immortal Façade pattern based on the Gang of Four’s Façade design pattern [7]. The Immortal Façade consists of a façade class and an implementation class. The façade class acts as a surrogate for and typically implements the same interface as the actual implementation class. It encapsulates the logic that handles the cross-scope invocation. The façade objects need to be accessible from scopes of interest, so they are frequently allocated in immortal memory and managed by a pool. The implementation class implements the actual business logic behind the façade. An instance of it is allocated in a specific scoped memory.

In RTZen, two key patterns, Cross-scope Invocation and Immortal Façade, have been used to hide the complex scoping structures between callers and callees. One example of the combined use of these two patterns is the ORB façade. RTZen

2.2.3 Immortal Exception

Context. In RTSJ applications, exceptions may need to be thrown and handled in different memory areas.

Problem. However, in RTSJ, the propagation of exceptions is restricted by memory access rules. A given exception object must be handled in a memory area that can legally reference that exception. If not, a ThrowBoundaryError is thrown and the original exception is lost. As shown in Figure 3, RTSJ’s memory area rules introduce accidental complexity into exception handling. The CORBA specification requires exceptions to be thrown in many scope regions. However, some of those exception objects cannot be handled in their local scopes, yet cannot be legally accessed from the region that can handle them either. For example, an exception raised in the Thread Pool Scope may need to be handled in ORB Memory Scope, but this access is prohibited by RTSJ memory access rules.

Corsaro et al. [5] proposed that exceptions can be initially handled in the local scope. With this approach, the notification of the exceptional condition is encapsulated in a status variable or object and then transferred to an outer scope, where the condition is finally handled, or propagated again to an outer scope. Although effective, this approach has the following drawbacks: 1) the code complexity is increased; 2) the exception propagation mechanism is tightly coupled with the system’s memory structure; 3) the actual exceptional condition may not


maintains a pool of ORB façade objects in immortal memory. These façades do not implement any business logic. All the logic is contained in the ORB implementation object hosted in the ORB scope. Since the ORB façade is in immortal memory, the user can access it with ease and make invocations on it. The Cross-scope Invocation pattern is used when the invocation thread needs to laterally traverse scoped regions.

In conclusion, the first step in ensuring predictability and efficiency of memory usage is to organize the system with design patterns to use scoped memory regions carefully and correctly. RTZen’s high-level hierarchical scoped memory design and application of these four new design patterns enables memory to be used predictably and efficiently, while maintaining compliance with both the Real-time CORBA specification and the RTSJ and while shielding the memory management complexity from the RTZen user. Enabling good memory usage in the design phase is only the first step, however. The second step is to ensure that the implementation itself does not have any memory leaks that would compromise predictability and efficiency while RTZen is actually running.
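The division of labor between a façade and its implementation can be sketched in plain Java. All names here (`NameService`, `NameServiceImpl`, `ScopeGateway`, `NameServiceFacade`) are hypothetical and chosen only for illustration. The sketch also simplifies one point: the façade stores a direct reference to the implementation, which an object in immortal memory could not legally do under RTSJ; a real façade would reach the implementation through its scope, and `ScopeGateway` merely marks where that traversal would happen.

```java
import java.util.function.Supplier;

/** Hypothetical service interface shared by the façade and the implementation. */
interface NameService {
    String resolve(String name);
}

/** Implementation class: holds the business logic; would live in its own scoped memory. */
final class NameServiceImpl implements NameService {
    @Override
    public String resolve(String name) {
        return "ref:" + name;
    }
}

/** Stand-in for the scope traversal of a cross-scope invocation; counts entries. */
final class ScopeGateway {
    private int entries;

    <T> T runInside(Supplier<T> logic) {
        entries++; // a real gateway would enter the callee's memory area here
        return logic.get();
    }

    int entries() { return entries; }
}

/** Façade: same interface, no business logic, hides how the implementation is reached. */
final class NameServiceFacade implements NameService {
    private final ScopeGateway gateway;
    private final NameService impl; // simplification: see the note above the code

    NameServiceFacade(ScopeGateway gateway, NameService impl) {
        this.gateway = gateway;
        this.impl = impl;
    }

    @Override
    public String resolve(String name) {
        // Every call is routed through the gateway; the caller never sees the scope structure.
        return gateway.runInside(() -> impl.resolve(name));
    }
}
```

The caller programs against `NameService` only, so the location of the implementation object can change without affecting client code.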

and recreate all objects. This issue is even larger when it occurs in the immortal memory region since this region cannot be reclaimed without restarting the whole JVM. Such an operation can cause severe jitter for time-critical tasks and is unacceptable in real-time applications.

Due to the lack of debugging tools for RTSJ, it is very difficult to visualize the scoped hierarchy of an application or to track down memory leaks. A simple solution is to place print statements in the code to trace memory leaks. However, this solution is tedious and error-prone and can be the cause of even more memory leaks.⁴ For large applications with a complex memory hierarchy, manually tracing calls may overwhelm the application developer with information. Also, the multi-threaded nature of RTSJ applications may add another layer of complexity to this issue.

To address these difficulties, we have developed IsoLeak, a runtime debugging tool distributed along with RTZen to help RTSJ application developers visualize the scoped hierarchies of complex applications and locate potential memory leaks. IsoLeak uses BCEL [1], a bytecode engineering library, to instrument RTSJ applications and gather debugging information. The information gathered includes the time of occurrence of the event, the event type (method entry/exit), method name, and memory information about current scoped and immortal memory regions. Along with the memory size information, it gathers information about the scoped memory hierarchy of the application. Figure 7 shows a fragment of source code equivalent to the bytecode instrumented by IsoLeak. IsoLeak adds calls to two methods (logMethodEntry and logMethodExit) that delimit the body of the instrumented method, and activate and deactivate program instrumentation.

To process the collected data points, IsoLeak must know when the application has reached steady state.
Currently, IsoLeak does not identify application steady state automatically: the application must inform IsoLeak when it has reached steady state. Next, IsoLeak must detect transient scoped regions used by the application. There are two approaches to identifying these regions. The naive approach determines how long the scoped region stays alive before being collected and flags all short-lived scopes as being transient. This approach works for most trivial cases; however, it can mistakenly identify non-transient scopes in some cases (e.g., if the initialization of an object fails within a non-transient scope and the threads exit the scope prematurely). The second approach is to observe the objects allocated within the scoped regions. In RTZen, all non-transient scoped regions have a portal object [8] which points to the implementation object associated with the region, whereas transient scopes do not. IsoLeak uses this information to flag transient scopes. To be more flexible, IsoLeak also allows the developer to use a Java interface to flag the portal object to indicate whether or not the scope region is transient. Finally, after collecting instrumentation data, IsoLeak presents a graph of the scoped memory hierarchy of the application that visually indicates all the regions with potential or confirmed memory leaks. Figure 8 shows the output of IsoLeak after being applied to RTZen. Furthermore, when the user clicks

3 Ensuring Efficient Use of Memory

In C/C++ a memory leak occurs when the developer allocates an object and forgets to deallocate it. Standard Java bypasses the problem by introducing a Garbage Collector (GC). Since RTSJ moves away from the concept of garbage collection, the issue of memory leaks is reintroduced. However, the definition of a memory leak in RTSJ is different from that in C/C++ because the RTSJ application developer cannot explicitly reclaim memory allocated for objects. Any allocation in immortal or scoped memory regions has the potential of being a memory leak, especially when the allocated object cannot be recycled. However, not all allocations, even for non-recyclable objects, can be classified as a memory leak.

To distinguish a leak from a required allocation, the lifetime of an application can be divided into three phases: initialization, operation, and termination. The initialization phase of an application encompasses all the logic and allocations that are executed before the system reaches a steady state. All the allocations after steady state occur in the operation phase until the application is ready to shut down, at which time it enters the termination phase. Allocation of non-recyclable objects during the operation phase results in memory leaks and should be avoided. If a temporary allocation must be made, however, the application developer should use the Encapsulated Method pattern [8] to execute in a transient scoped region so that the memory may be recovered.

In RTSJ applications, memory leaks can cause two types of problems. First, immortal and scoped memory regions are created with a bounded size. Constantly allocating objects without any management, therefore, can cause the application to run out of memory space and throw an OutOfMemoryError. Second, sporadic allocations of objects can also lead to increased jitter. Recovering from memory leaks is an expensive operation.
Scoped memory regions can only be deallocated after there are no active threads left in the region. Thus, the only way to recover space in a scoped region is to exit and reenter the region

4 Some Java APIs are not scoped memory safe and may internally allocate memory causing memory leaks.


    public void foobar( ... ) {
        IsoLeak.logMethodEntry(...);
        try {
            ...
        } finally {
            IsoLeak.logMethodExit(...);
        }
    }

[Figure 8 diagram; node labels recovered from the extraction: ORBImpl, POAImpl, Acceptor, Thread pool, Transport, Transient.]

Figure 8: Scoped memory hierarchy of RTZen generated using IsoLeak tool.

Figure 9: Trace of memory allocation in the POA’s scope (bytes vs. milliseconds)

on a region, IsoLeak presents a graph of memory allocations vs. time corresponding to the selected region (shown in Figure 9).

Figure 7: Source code equivalent to the bytecode instrumented by IsoLeak

[Figure 8 legend entries: Transient or Unknown, Safe, Leak; further node labels: Application Scope, Transient]

4 Empirical Results

RTZen’s memory management was carefully organized with design patterns to provide predictability and efficiency. RTZen was then evaluated with IsoLeak to detect any memory leaks in its implementation that would degrade predictability. We therefore expected RTZen to be highly predictable. We also expected its performance to be good, since careful memory management has the side effect of increased efficiency. We compared RTZen with JacORB, a regular Java ORB, to show the predictability improvement possible using RTSJ. We also compared RTZen with TAO, an efficient, predictable, widely used open-source Real-time CORBA ORB for C++.

4.1 Experiment 1: Real-time vs. Regular Java

All experiments were run on 865 MHz Pentium III (Coppermine, 256KB Cache) processors with 512MB PC133 ECC SDRAM, on both the server and client sides, connected via 10 Mbps switched Ethernet on a closed subnet. The operating system was TimeSys Linux GPL 4.1 based on the Linux kernel 2.4.21, which supports the Native POSIX Thread Library (NPTL) [4]. The non-real-time Java Virtual Machine (JVM) used for comparison was the Sun JDK 1.4 JVM. The real-time Java platform was jRate [6], a real-time Java ahead-of-time compiler. A single thread ran on the client side, sending variable-size octet sequences (4, 32, 128, 256, and 512 bytes) to the server side. For all tests, measurements were based on 10,000 steady-state observations: the system ran until the transitory effects of cold starts were eliminated, and only then were the measured observations collected.

RTZen’s typical performance and jitter were roughly constant across all message sizes tested, up to 512 bytes, the allocated buffer limit. (RTZen allows the application developer to configure the message buffer size to customize performance and predictability as required.) The same constant performance and jitter were seen in tests of JacORB. Therefore, the results that follow are shown for a message size of 128 bytes.

Jitter of RTZen/jRate vs. JacORB/Sun JVM. As expected, RTZen is highly predictable compared to JacORB. Table 1 shows JacORB’s jitter when run with the default garbage collector, with which JacORB obtained its narrowest jitter among the four types of garbage collectors supported by the JVM (default, throughput, concurrent low pause, and incremental) [10]. Using the range (max - min) as a measure of jitter, RTZen’s jitter is only 90 µs, while JacORB’s is 9,770 µs. This result shows that a carefully designed implementation of an ORB for real-time Java can be highly predictable compared to a regular Java ORB.

ORB/Platform      Jitter    Min.     Max.   Median     CPS
JacORB/Sun JVM      9770     935    10705     1026     881
RTZen/jRate           90     563      653      579    1737
RTZen/Sun JVM        650     347      997      376    2533

Table 1: Jitter, minimum, maximum, and median round-trip latencies in µs; throughput in calls per second (CPS).

Performance of RTZen/Sun JVM vs. JacORB/Sun JVM. Although predictability is of utmost importance to real-time systems, it is also important that performance not be unduly sacrificed for predictability. We expected that careful memory management, such as extensive memory reuse with memory pools, would lead to better performance as well as lower jitter. As shown in Table 1, RTZen running on jRate is much faster than JacORB. That comparison alone, however, does not show how much of the performance difference is attributable to the ORBs themselves, so we also compared RTZen with JacORB with both running on the same JVM. In this test, RTZen used Mock RTSJ classes that allow it to run on standard JVMs.5 All scopes and immortal memory regions were therefore simulated as heap memory, and all allocations in those regions were subject to garbage collection. Of the four types of garbage collectors measured, JacORB obtained its highest throughput with the throughput garbage collector. Under the same conditions, as shown in Table 1, RTZen significantly outperforms JacORB: JacORB achieves a throughput of 881 calls/s, while RTZen achieves 2,533 calls/s. As expected, careful memory management improves performance in addition to predictability.

5 Mock RTSJ classes expose a reduced set of the RTSJ API and do not perform allocation and access checks.
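The quantities reported in Table 1 can be derived from raw round-trip samples roughly as in the sketch below. LatencyStats is a hypothetical helper, not part of RTZen or its test harness, and the sample values in main are invented for illustration. Jitter is computed as the range (max - min), as in the text. The throughput formula is an assumption: it treats a single client thread issuing calls back to back, so that calls per second is the number of calls divided by their total duration. The paper does not state how CPS was measured.

```java
import java.util.Arrays;

/** Hypothetical helper that summarizes round-trip latency samples given in microseconds. */
final class LatencyStats {
    final long minMicros;
    final long maxMicros;
    final long jitterMicros;      // range of the samples: max - min
    final double medianMicros;
    final double callsPerSecond;  // assumption: back-to-back calls from one client thread

    private LatencyStats(long min, long max, double median, double callsPerSecond) {
        this.minMicros = min;
        this.maxMicros = max;
        this.jitterMicros = max - min;
        this.medianMicros = median;
        this.callsPerSecond = callsPerSecond;
    }

    /** Computes the statistics; samples are assumed to be positive latencies in microseconds. */
    static LatencyStats of(long[] latenciesMicros) {
        if (latenciesMicros.length == 0) {
            throw new IllegalArgumentException("no samples");
        }
        long[] sorted = latenciesMicros.clone();
        Arrays.sort(sorted);
        int n = sorted.length;

        // Median: middle element, or the mean of the two middle elements for even n.
        double median = (n % 2 == 1)
                ? sorted[n / 2]
                : (sorted[n / 2 - 1] + sorted[n / 2]) / 2.0;

        long totalMicros = 0;
        for (long sample : sorted) {
            totalMicros += sample;
        }
        double callsPerSecond = n * 1_000_000.0 / totalMicros;

        return new LatencyStats(sorted[0], sorted[n - 1], median, callsPerSecond);
    }

    public static void main(String[] args) {
        // Invented sample values, for illustration only.
        LatencyStats stats = LatencyStats.of(new long[] {420, 400, 500, 410, 470});
        System.out.println("jitter (us): " + stats.jitterMicros);
        System.out.println("median (us): " + stats.medianMicros);
        System.out.println("calls/s:     " + stats.callsPerSecond);
    }
}
```

The jitter column of Table 1 is consistent with the range definition: for JacORB/Sun JVM, 10705 - 935 = 9770, and for RTZen/jRate, 653 - 563 = 90.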


4.2 Experiment 2: Real-time Java vs. C++

Although we showed our real-time Java ORB to be much more predictable than a regular Java ORB, another important comparison is how predictable and efficient RTZen is compared to a C++ ORB such as TAO [9]. The testing environment, number of observations, and message sizes were the same as in Experiment 1.

Experiment 2 tested whether high-priority threads show minimal jitter even when other, lower-priority threads are running. Two threads were run simultaneously: the first at the highest CORBA priority and the second at the lowest. The low-priority thread performed a long operation, while the high-priority thread performed a short operation that had to interrupt the low-priority thread. If the low-priority thread interfered with the execution of the high-priority thread, the jitter of the round-trip latencies would increase for the high-priority thread. As shown in Figure 10, RTZen’s jitter for the high-priority thread is comparable to TAO’s (469 µs and 448 µs, respectively).

TAO was used to provide a benchmark for Real-time CORBA performance. RTZen is still slower than TAO; however, considering the overhead of RTSJ and Java VMs, as well as the relatively early stage of research and development of RTSJ, RTZen’s performance compares favorably to TAO’s.

In summary, as expected, RTZen’s predictability was greater than that of JacORB (a standard Java ORB) and comparable to that of TAO (a real-time C++ ORB). In addition, RTZen was only somewhat slower than TAO and significantly faster than JacORB.

[Figure 10 plot: round-trip latency in microseconds (axis range 400 to 1800) for RTZen on jRate and TAO]

Figure 10: Jitter of high-priority thread with RTZen/jRate vs. TAO

5 Conclusion

As an alternative to garbage collection, RTSJ offers explicit control of memory management by introducing the ability to create scoped memory. This extra control has two important consequences. First, even though this scheme is still simpler than per-object memory management, it adds complexity to the design of real-time systems written in Java because of the tight restrictions imposed on cross-scope memory access. Second, explicit control of memory creates the potential for memory leaks, and debugging memory leaks is tedious and error-prone.

We developed two solutions to these problems while developing RTZen, an open-source Real-time CORBA middleware implemented in real-time Java. The first solution was the discovery of design patterns to manage RTSJ’s memory access restrictions. The second solution was the development of IsoLeak to map out the scoping structure of RTZen and to isolate memory leaks easily. These two solutions were essential to making RTZen fast, predictable, and robust, as demonstrated by our measurements. Moreover, they can be used to bring these qualities to any RTSJ application.

As the largest known open-source RTSJ project, RTZen demonstrates that real-time Java and Real-time CORBA are maturing into viable technologies for DRE system development. More importantly, our work shows that these specifications can be combined into a single middleware architecture that combines the advantages of each. The result of achieving these goals is a predictable, efficient, customizable, and embeddable RTSJ implementation of CORBA.

References

[1] Apache Software Foundation. BCEL - Byte Code Engineering Library. http://jakarta.apache.org/bcel/manual.html, 2002.
[2] E. G. Benowitz and A. F. Niessner. A patterns catalog for RTSJ software designs. In Lecture Notes in Computer Science, volume 2889, pages 497–507. OTM 2003 Workshops, November 2003.
[3] Bollella, Gosling, Brosgol, Dibble, Furr, Hardin, and Turnbull. The Real-Time Specification for Java. Addison-Wesley, 2000.
[4] TimeSys Corp. TimeSys Linux GPL 4.1. www.timesys.com, 2004.
[5] A. Corsaro and C. Santoro. Design patterns for RTSJ application development. In Lecture Notes in Computer Science, volume 3292, pages 394–405. OTM 2004 Workshops, October 2004.
[6] A. Corsaro and D. C. Schmidt. The Design and Performance of the jRate Real-Time Java Implementation. In R. Meersman and Z. Tari, editors, On the Move to Meaningful Internet Systems 2002: CoopIS, DOA, and ODBASE, pages 900–921, Berlin, 2002. Lecture Notes in Computer Science 2519, Springer Verlag.
[7] E. Gamma, R. Helm, R. Johnson, and J. Vlissides. Design Patterns: Elements of Reusable Object-Oriented Software. Addison-Wesley, Reading, MA, 1995.
[8] F. Pizlo, J. M. Fox, D. Holmes, and J. Vitek. Real-time Java scoped memory: Design patterns and semantics. In 7th IEEE Int’l Symposium on Object-Oriented Real-Time Distributed Computing (ISORC 2004), pages 101–110, May 2004.
[9] D. C. Schmidt, D. L. Levine, and S. Mungee. The design of the TAO real-time object request broker. Computer Communications, 21(4):294–324, April 1998.
[10] Sun Microsystems, Inc. Tuning garbage collection with the 1.4.2 Java[tm] virtual machine. 2003.
