Code Generation and Optimization for Java-to-C Compiler

Youngsun Han, Shinyoung Kim, Hokwon Kim, and Seon Wook Kim
Department of Electronics and Computer Engineering
Korea University, Seoul, Korea

Abstract. The Java programming language is currently popular in Internet-based systems and in mobile and ubiquitous devices because of its portability and security. However, its performance is inherently limited by the overhead of interpreting class files in software on a Java Virtual Machine (JVM). In this paper, as one solution to this performance limitation, we present code generation and optimization techniques for a Java-to-C translator. Our compiler framework translates Java bytecode into C code while preserving Java's programming semantics, such as inheritance, method overloading, virtual method invocation, and garbage collection. Moreover, our compiler translates for in Java into for in C instead of test and jump for better performance. Our runtime library fully supports the Connected Limited Device Configuration (CLDC) 1.0 APIs.

1 Introduction

Applications written in Java can be compiled into platform-independent code that moves across the Internet and runs on every platform. Java's portability is achieved by compiling its classes into a distribution format called a class file. Class files are interpreted and executed on any computer supporting the Java Virtual Machine (JVM). Java's code portability therefore depends both on platform-independent class files and on the implicit assumption that a full-featured JVM is supported on every client machine.

However, despite its distinguished advantages over other programming languages, Java has two shortcomings: the size of the Java virtual machine and the performance limitation of software interpretation. A full-featured JVM is too big to be used on small devices such as mobile phones and PDAs. Due to the limited computing resources of small embedded devices, a few different versions of JVMs have been proposed [1, 2], and therefore class files cannot be executed on all kinds of client machines. Examples of full-featured JVMs are the JVM from Sun Microsystems [3], Jikes RVM [4], and Kaffe [5]; an example of a partially featured JVM constrained by limited resources is the Java 2 Platform, Micro Edition (J2ME) [2].

Software interpretation and execution incur much higher runtime overhead than direct execution of native code. In order to alleviate the performance problem of software execution, many methods have been proposed, such as just-in-time (JIT) and ahead-of-time (AOT) compilers.

A just-in-time compiler [4, 6–9] converts a sequence of bytecode (a method) into native code at runtime, dynamically links it, and executes it. This approach has been widely used in recent years. The alternative is to translate class files into C code offline [10–13]. Each method has pros and cons compared with the other in terms of performance and applicability. Static compilation provides high-quality optimizations and code generation that cannot be performed by a runtime compiler such as a JIT. It therefore generates better-quality target code, but it loses portability because the output code is machine specific. Nevertheless, this scheme can be useful in embedded systems, where execution time and code size are critically important implementation factors. Also, runtime code generation and optimization, as in a JIT, may introduce overhead to the whole system in a resource-limited environment.

In this paper we present code generation and optimization techniques for a Java-to-C compiler, which translates Java bytecode into C code while the generated code preserves Java's programming semantics, such as inheritance, method overloading, virtual method invocation, and garbage collection. Our compiler also fully supports the Connected Limited Device Configuration (CLDC) 1.0 APIs. Moreover, our Java-to-C compiler translates for in Java into for in C in order to obtain better performance, because the gcc compiler cannot optimize the goto statements generated for for in Java by our earlier Java-to-C compiler, but it can apply various optimization techniques to for in C.

Toba [11] is a system to generate standalone Java applications, targeted at JDK 1.1. It has a bytecode-to-C translator and additional runtime libraries to support garbage collection, thread management, and the Java API. In [12], a Java-through-C compilation system for embedded systems has been developed. There are two major differences between our Java-to-C compiler and the others. Our compiler only generates good-quality code to which a backend compiler (e.g., the gcc compiler) can apply code optimization effectively.¹

¹ We could not compare the performance between our compiler and the others, since the other compilers are not publicly available; we compare against the gcj compiler instead.

The paper is organized as follows: we present the structure of our Java-to-C compiler in Section 2, the runtime structures that preserve the Java features in Section 3, and the framework for C code emission from Java bytecode in Section 4. In Section 5, we present the implementation of an optimization technique that generates for in C for for in Java instead of test and jump. Section 6 discusses code generation and optimization issues with a performance analysis, and Section 7 concludes.

2 Structure of Java-to-C Compiler

Our Java-to-C compiler is organized into three components: a Java decoder, a bytecode-to-C translator, and runtime libraries. Figure 1 shows the structure of the compiler. The Java decoder analyzes the class files and generates class blocks to maintain class information. The translator converts Java bytecode sequences into sequences of C code. Finally, the generated C code is linked with the runtime libraries, which include routines for garbage collection, thread management, and the Java CLDC 1.0 API, in order to build executable code.

Fig. 1. Structure of our Java-to-C compiler.

2.1 Java Decoder

A Java compiler compiles a Java source file into several class files. After each class file is read by the Java decoder, it is translated into a class block. The class block is organized to maintain information such as the fields, methods, and super class, which is needed for the subsequent compilation steps. During the translation of a class file, all of its associated class files are translated together.

2.2 Bytecode-to-C Translator

The compilation process of the bytecode-to-C translator is divided into two phases: a preprocessor and a translator. Translation is performed for each method. The preprocessor scans the whole bytecode sequence of a method in order to build a control flow graph (CFG). The control flow graph is used to adjust the status of a virtual stack across parent and child basic blocks. The stack has several slots that store the types of temporary variables, and this information is used to name variables with a numbering scheme. It is well known that temporary variables can easily be assigned to each stack slot, since the operand stack naturally maintains the liveness information of temporary variables for bytecode execution inside the JVM [14].
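To make the naming scheme concrete, the small sketch below shows how a typed slot of the simulated operand stack could be mapped to a C temporary name such as i0, a2, or d3 (the names that appear in the generated code of Figures 5 and 8). The slot_type enumeration and the slot_name function are illustrative assumptions, not the translator's actual data structures.

#include <stdio.h>

/* Hypothetical descriptor of one simulated operand-stack slot. */
typedef enum { T_INT, T_LONG, T_FLOAT, T_DOUBLE, T_REF } slot_type;

typedef struct {
    slot_type type;   /* Java type held in this slot           */
    int       depth;  /* slot index inside the simulated stack */
} stack_slot;

/* Map a slot to a C temporary name: type prefix + slot number,
 * e.g. (T_INT, 0) -> "i0", (T_REF, 2) -> "a2", (T_DOUBLE, 3) -> "d3". */
static void slot_name(const stack_slot *s, char *buf, size_t len)
{
    static const char prefix[] = { 'i', 'l', 'f', 'd', 'a' };
    snprintf(buf, len, "%c%d", prefix[s->type], s->depth);
}

int main(void)
{
    stack_slot s = { T_DOUBLE, 3 };
    char name[16];
    slot_name(&s, name, sizeof name);
    printf("%s\n", name);   /* prints "d3" */
    return 0;
}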

In the second phase, each basic block of the graph is translated into C code. Figure 2 illustrates the translation process. When we construct the CFG, we insert two additional dummy basic blocks, a prologue and an epilogue, at the beginning and the end of each method block for easy code generation. The prologue maintains a symbol-table structure for its enclosing method and initializes the symbol table properly; similarly, the epilogue finalizes the code emission of the enclosing method. The compiler traverses each basic block and executes its handler, which emits the translated C code.

Fig. 2. Overview of bytecode-to-C translation process.

The generated C program maintains the full features of the Java platform, such as inheritance, method overloading, and virtual method invocation. For code generation, the translator has prototypes for Java runtime data structures, class initialization, method invocation, garbage collection, and a main method that starts up the translated C program. While code generation is performed, these prototypes are copied into the generated C program as needed.
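As an illustration, the startup code emitted for a translated application might have the following shape. Every identifier here (the class initializer name, the mangled main name, and so on) is a hypothetical stand-in for the prototypes described above, not the compiler's actual output.

#include <stdio.h>

/* Hypothetical skeleton of the C main() emitted for a translated
 * Java application; real prototypes and names differ. */

static int Main_class_initialized = 0;

static void init_class_Main_5F2C90AA(void)   /* static initializer of the main class */
{
    Main_class_initialized = 1;
}

static void main_0A3B17C2(void *args)        /* translated static void main(String[]) */
{
    (void)args;
    printf("translated Java main() body would run here\n");
}

int main(void)
{
    /* runtime startup: initialize the required classes,
       then invoke the translated Java entry point */
    init_class_Main_5F2C90AA();
    main_0A3B17C2(0);
    return 0;
}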

2.3 Runtime Libraries

Our runtime library consists of three components: Java native methods, thread packages, and a garbage collector. The Java platform [1, 2] includes several APIs that invoke native methods. Unfortunately, the implementation of native methods depends on the target system; whenever the target system changes, the native methods must be coded again. The current version of our Java-to-C compiler fully supports the Connected Limited Device Configuration 1.0 (CLDC 1.0) API, including its native methods. In order to simplify the generated C code, some runtime routines were added to the runtime library. Routines for garbage collection and thread management are also included in the runtime libraries.
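For example, a CLDC native method such as java.lang.System.currentTimeMillis() might be implemented roughly as follows on a POSIX target; the mangled C name and the j_long typedef are assumptions made for this sketch.

#include <sys/time.h>

typedef long long j_long;   /* assumed mapping of Java's long */

/* Hypothetical implementation of the native method
 * java.lang.System.currentTimeMillis() for a POSIX target. */
j_long java_lang_System_currentTimeMillis(void)
{
    struct timeval tv;
    gettimeofday(&tv, 0);
    return (j_long)tv.tv_sec * 1000 + tv.tv_usec / 1000;
}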

3 Runtime Structures

When our Java-to-C compiler translates a Java application into C code, it needs runtime structures that preserve Java's object-oriented features, such as inheritance, method overloading, and virtual method invocation. For this purpose, the following aspects must be considered: naming convention, data layout, and method invocation.

Java entities such as classes and methods are uniquely identified by their names plus an additional hash-code suffix that avoids naming conflicts in the global namespace of the C program. The name of each Java method is likewise mapped to a distinct C name, so a Java feature such as method overloading is naturally supported. Java primitive types are translated into C primitive types of the same data size; for example, Java's character type is mapped to an unsigned short in C.

Java objects and arrays are reference types that extend the java.lang.Object class. A reference type is translated into a C pointer type. Each reference points to the runtime data structure for an object or an array in C. This data structure has a pointer to a common class structure, which consists of three components: a class descriptor table, a method table, and a static variable table. The class descriptor table contains general information needed for all classes; Table 1 shows its details. The method table contains function pointers to the actual methods to be invoked. During Java-to-C compilation, these pointers are overwritten according to the inheritance relation. Figure 3 illustrates how inheritance is implemented. Each class has a common part for itself. The methods of class A are copied into class B when B inherits from A, and methods newly defined in class B are added; the method i overridden in class B replaces the one inherited from class A.

Java methods can be invoked in several ways according to how they are referenced. Static methods, including constructors, are always invoked without reference to a particular object but through a class; therefore, the class containing the pointers to these methods must already have been initialized before invocation. Unlike static methods, instance methods are referenced through a specific object and are resolved by a runtime symbolic link. Finally, when an interface method is invoked, our Java-to-C compiler would have to perform an exhaustive search to find the method to be invoked; for a simple implementation, we substitute a qualified table lookup for this scheme.

Table 1. Elements of a class descriptor table.

Element                   Description
int need_init             Flag indicating whether the class has already been initialized.
int flag                  Access information of the class.
int instance_size         Byte size of the class.
j_class super             Pointer to the parent class.
j_class array_class       Pointer to the array class of this class.
j_class elem_class        Pointer to the element class, if this class is an array class.
ihash *method_hash        Hash codes for each method.
int method_num            Number of methods in the class.
void (*static_const)()    Pointer to the static class initializer.
void (*def_const)()       Pointer to the default constructor.
void (*finalize)()        Pointer to the function for finalization.

Fig. 3. Example of inheritance between classes.

Table 2 shows examples of how Java methods are referenced.

Table 2. Java method invocation.

Method              Kind              Scheme
streamobj.print()   static method     print_208FF022()
vectorobj.size()    instance method   ((java_util_Vector)a0)->vptr->size_00367B69(a0)
threadobj.run()     interface method  ((void (*)(j_object)) find_interface(a0, 0x9205edb))(a0)
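The runtime layout and the three invocation schemes of Table 2 can be sketched in C roughly as follows. The structure layout is a simplified reading of Table 1, and the identifiers (including the hash suffixes) are illustrative assumptions rather than the compiler's exact output.

/* Simplified sketch of the runtime structures (cf. Table 1 and Table 2). */
typedef struct j_class  j_class;
typedef struct j_object j_object;

struct j_class {
    int       need_init;              /* has the class been initialized?         */
    int       instance_size;          /* byte size of an instance                */
    j_class  *super;                  /* pointer to the parent class             */
    void    (*static_const)(void);    /* static class initializer                */
    /* method table: one function pointer per (possibly inherited) method */
    int     (*size_00367B69)(j_object *self);
};

struct j_object { j_class *vptr; };   /* every reference points to its class */

extern void  print_208FF022(void);                       /* translated static method */
extern void *find_interface(j_object *obj, unsigned h);  /* interface-method lookup  */

static int invocation_examples(j_class *stream_class, j_object *a0)
{
    /* static method (cf. streamobj.print()): called directly through its
       mangled name, after its class has been initialized */
    if (stream_class->need_init)
        stream_class->static_const();
    print_208FF022();

    /* instance (virtual) method (cf. vectorobj.size()):
       dispatched through the object's method table */
    int n = a0->vptr->size_00367B69(a0);

    /* interface method (cf. threadobj.run()):
       resolved by a hash-based table lookup, then invoked through a cast */
    ((void (*)(j_object *))find_interface(a0, 0x9205edb))(a0);
    return n;
}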

4 Code Generation

There are a few issues in code generation, such as bytecode translation, exception handling, garbage collection, and thread management.

4.1 Bytecode Translation

During translation, the preprocessor splits the whole bytecode sequence into several basic blocks to construct a control flow graph (CFG) for each method. The control flow graph is used to compute the stack state, through which the liveness of temporary variables is properly maintained. Because bytecode translation is performed at compile time, the information of the simulated operand stack is useful for code generation. Unlike existing Java-to-C compilers such as Toba [11], which perform this computation for every instruction, our compiler does so only between basic blocks. The preprocessor performs the following operations for each method:

1. Find leaders for conditional and unconditional jump instructions.
2. Split basic blocks using the leader information.
3. Build up a control flow graph.
4. Maintain the consistency of the stack state between basic blocks.
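The first two steps follow the classical basic-block construction algorithm. The sketch below illustrates the leader-marking step on a simplified, hypothetical instruction representation; the real preprocessor operates on the raw class-file bytecode.

#include <string.h>

/* Invented instruction representation for the example only. */
typedef struct {
    int is_jump;      /* conditional or unconditional jump?    */
    int target;       /* index of the jump target instruction  */
} insn;

/* leader[i] == 1 marks the start of a basic block. */
static void find_leaders(const insn *code, int n, unsigned char *leader)
{
    if (n <= 0)
        return;
    memset(leader, 0, (size_t)n);
    leader[0] = 1;                        /* the first instruction is a leader  */
    for (int i = 0; i < n; i++) {
        if (code[i].is_jump) {
            leader[code[i].target] = 1;   /* the jump target is a leader        */
            if (i + 1 < n)
                leader[i + 1] = 1;        /* so is the instruction after a jump */
        }
    }
}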

After the preprocessor has found the basic blocks, bytecode-to-C translation is performed for each basic block. The generated C code for all basic blocks in a Java method is wrapped in a switch statement, as in Toba [11].

4.2 Exception Handling

The switch statement is also used to handle Java exceptions. Some exceptions are specified by the Java virtual machine specification [15] to be thrown when certain conditions are satisfied; these exceptions can be ignored depending on the execution environment. The runtime program counter of the JVM is used to find the exception handler. A local variable pc is employed to mimic the program counter of the JVM and is set at the beginning of every basic block. Additionally, C's setjmp and longjmp routines are used to handle exceptions.
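The sketch below illustrates, in a heavily simplified and hypothetical form, how the pc variable, the per-method switch, and setjmp/longjmp could cooperate; the real generated code also restores the operand-stack state and deals with nested and re-thrown exceptions, which are omitted here.

#include <setjmp.h>

/* Hypothetical, heavily simplified shape of a translated method with one
 * exception handler. */

static jmp_buf handler_env;        /* where throw sites longjmp to */

static void throw_exception(void)  /* stand-in for the runtime throw routine */
{
    longjmp(handler_env, 1);
}

static void translated_method(int divisor)
{
    volatile int pc = 0;           /* mimics the JVM program counter */

    if (setjmp(handler_env) != 0)
        goto dispatch;             /* an exception arrived: find its handler */

    /* --- basic blocks of the method body --- */
    pc = 4;
    if (divisor == 0)
        throw_exception();         /* e.g. an ArithmeticException site */
    pc = 9;
    return;                        /* normal completion */

dispatch:
    /* the switch selects the handler that covers the faulting pc */
    switch (pc) {
    case 4:
        /* handler code for the protected region starting at pc 4 */
        return;
    default:
        /* no handler here: re-throw to the caller (omitted) */
        return;
    }
}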

4.3 Garbage Collection

Automatic garbage collection is provided by Java: objects that are no longer referenced in a Java program are deallocated without any effort by the programmer. We use the Boehm-Demers-Weiser conservative garbage collector [16].
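Assuming the standard Boehm GC interface (gc.h, GC_INIT, GC_MALLOC), allocation in the generated code can be sketched as follows; the object layout and helper name are invented for illustration.

#include <gc.h>      /* Boehm-Demers-Weiser collector */

typedef struct {
    void *vptr;      /* class pointer, as in Section 3 */
    int   value;
} example_object;

/* Hypothetical allocation helper: objects come from the collected heap,
 * so the translator never emits an explicit free. */
static example_object *alloc_example(void)
{
    return (example_object *)GC_MALLOC(sizeof(example_object));
}

int main(void)
{
    GC_INIT();                        /* initialize the collector           */
    example_object *o = alloc_example();
    o->value = 42;                    /* use the object; it is reclaimed    */
    return o->value == 42 ? 0 : 1;    /* automatically once it is unreachable */
}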

4.4 Thread Management

Our system targets the Connected Limited Device Configuration (CLDC) 1.0 [17]. Because only the java.lang.Thread class is specified in the CLDC API, the complex runtime support for java.lang.ThreadGroup is not implemented. The Java runtime package for thread management is implemented using the Linux PThread libraries. Every object has a monitor, and a thread can hold a lock only through the monitor for which the threads compete; synchronization between threads is guaranteed via these monitors.
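A per-object monitor built on PThreads can be sketched as follows. The type and function names are assumptions, and a real monitor additionally needs recursive locking and wait/notify support, which are omitted here.

#include <pthread.h>

/* Hypothetical per-object monitor built on a PThread mutex. */
typedef struct {
    pthread_mutex_t lock;      /* the lock the threads compete for */
} j_monitor;

static void monitor_init(j_monitor *m)
{
    pthread_mutex_init(&m->lock, 0);
}

/* Emitted at the start/end of a synchronized region (monitorenter/monitorexit). */
static void monitor_enter(j_monitor *m) { pthread_mutex_lock(&m->lock); }
static void monitor_exit (j_monitor *m) { pthread_mutex_unlock(&m->lock); }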

5 Optimization

5.1 Motivation

Figure 4 shows one of the major Java methods in the LU benchmark, and Figure 5 shows the C output generated by our Java-to-C compiler and the assembly output produced by the gcc compiler. When the Java-to-C compiler translates a Java for loop into test and jump statements instead of for in C, the gcc compiler sometimes does not assign the loop induction variable to a register. As shown in Figure 5, the loop induction variable Lvi14 is incremented by the leal 1(%eax), %eax instruction, which incurs large processor stalls. When the Java-to-C compiler generates for in C for for in Java, we obtain about 20% higher speedup than before, since the gcc compiler applies loop optimizations to for in C. Therefore, we designed a state machine that identifies for in Java in the bytecode so that for in C can be generated. Using the gathered information, our compiler translates for in Java into for in C instead of test and jump.

5.2 Detection of for Loops in Java Bytecode

Our compiler detects Java for loops by using a state machine that recognizes the following code sections in a bytecode sequence: the initialization of an induction variable, a loop bound, the modification of the induction variable, and a backward jump instruction marking the end of the for loop. The state diagram is shown in Figure 6. This state machine can recognize several for loop patterns in Java bytecode; the recognizable patterns are shown in Table 3.

Table 3. for loop patterns which our compiler can recognize: for(A = B; A < C; A = A + D or A++) { ... }

Initialization of an induction variable (A = B)
  Value initializing the induction variable: iconst, fconst, lconst, dconst, bipush, sipush, iload, fload, lload, dload, getstatic, invokestatic, aload-arraylength, aload-getfield, aload-invokevirtual, aload-invokespecial, aload-invokeinterface
  Store: istore, fstore, lstore, dstore

Comparison part (A < C)
  Load the induction variable: iload, fload, lload, dload
  Variable compared with the induction variable: iconst, fconst, lconst, dconst, bipush, sipush, iload, fload, lload, dload, getstatic, invokestatic, aload-arraylength, aload-getfield, aload-invokevirtual, aload-invokespecial, aload-invokeinterface
  Conditional jump (compare): if_icmpeq, if_icmpne, if_icmpge, if_icmpgt, if_icmple, if_icmplt

Body ({ ... })
  Body of the for loop: various instructions can occur

Modification of the induction variable (A = A + D)
  Load the induction variable: iload, fload, lload, dload
  Modification: add or various other kinds of arithmetic and logic operations
  Store the updated induction variable: istore, fstore, lstore, dstore

Modification of the induction variable (A++)
  Autoincrement: iinc

public static int factor(double ad[][], int ai[]) { .... for (int l = 0; l < k; l++) { ... if (l < k - 1) { for (int k1 = l + 1; k1 < j; k1++) { double ad2[] = ad[k1]; double ad3[] = ad[l]; double d3 = ad2[l]; for (int i2 = l + 1; i2 < i; i2++) ad2[i2] -= d3 * ad3[i2]; } } } return 0; }

Fig. 4. Example of Java code in LU.

Initialization of an induction variable. In the state machine of Figure 6, state q0 is the initial state and state accept is the final state. In Java bytecode sequences, an induction variable can be either 1) a local variable or 2) a member field or a method of an object. In Case 1, bytecodes such as iconst, bipush, sipush, iload, invokestatic, getstatic, and so on are used to initialize the induction variable. In Case 2, aload appears first, and then arraylength, getfield, invokespecial, invokeinterface, or invokevirtual follows. If the compiler detects either case, the state machine goes to state q1. When the state machine is in state q1 and the following instruction is istore, the state machine moves into state q3; this istore instruction stores the initialized induction variable into memory. Thus, when the state machine reaches state q3, it decides that the code section initializing the induction variable has been found.

Loop bounds. In Java bytecode, after the initialization of the induction variable, the code section of the comparison part must appear, according to the syntax of for in Java. If any condition of the syntax is not satisfied, the state machine moves back to the initial state q0. To compare the induction variable with the loop bound, an iload instruction reads the induction variable from memory; at this point the state machine moves from state q3 into state q4. After the induction variable is loaded from memory, the loop bound variable is loaded. If the loop bound is a local variable, the state moves from q4 to q5, and if the loop bound is a member field or a method of an object, the state moves from q4 through q6 to q5.

_L229:
pc = 229;
i0 = Lvi14;
i1 = Lvi2;
if( i0 >= i1 ) goto _L257;
_L235:
pc = 235;
a0 = Lva10;
i1 = Lvi14;
a2 = a0;
i3 = i1;
d2 = ((struct darray*)a2)->data[i3];
d3 = Lvd12;
a4 = Lva11;
i5 = Lvi14;
d4 = ((struct darray*)a4)->data[i5];
d3 = d3 * d4;
d2 = d2 - d3;
((struct darray*)a0)->data[i1] = d2;
Lvi14 += 1;
goto _L229;
}
_L257:
=============================================
.L184:
cmpl -204(%ebp), %eax
movl $229, -188(%ebp)
jge .L186
.L187:
fld %st(0)
fmull 16(%esi,%eax,8)
fsubrl 16(%edx,%eax,8)
fstpl 16(%edx,%eax,8)
leal 1(%eax), %eax
movl $235, -188(%ebp)
jmp .L184
.L186:

Fig. 5. C and assembly outputs from the example Java code in Figure 4.

Fig. 6. State diagram to search for in Java bytecode.

Modification of an induction variable. For a simple experiment, we define the two patterns of induction-variable modification shown in Table 3. In the first pattern, we must find a pair of iload and istore instructions that access the same memory location before the goto statement of the backward jump (q7 → q8 → q9). The kinds of integer arithmetic and logic operations between these two instructions are not important; all we need are the start and the end of the modification code section. In the second pattern, the start and the end of the modification code section are the same instruction: when an iinc instruction appears, the state machine moves from state q7 into state q9. The only condition for iinc to be accepted as the modification of an induction variable is that it must be followed by a goto instruction for a backward jump.

A backward jump. When the state machine is in state q9, all the information needed about the for loop has been found. Therefore, if the next instruction is a goto that makes a backward jump to the start of the comparison part, the state machine goes into the final state. To verify a correct backward jump, we check whether the target address of the conditional jump is the instruction following the backward jump.

Nested loops. In order to detect nested loops efficiently, we use a loop stack as a data structure. Whenever the state machine reaches state q7 (after detecting an induction variable and a loop bound), all collected information is pushed onto the loop stack and the state machine returns to state q0. In this way, our state machine can find the front parts of all the loop patterns we define while scanning the whole code once. When the end of the code is reached, the stack holds the information about the front parts of all the loops found so far. The state machine then pops entries from the stack one by one and returns to state q7; the pc goes back to the address of the conditional jump given by the popped information, and the state machine checks whether those front parts of the loops are followed by proper instructions for the modification of the induction variable and a backward jump.
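The loop-stack bookkeeping can be sketched as follows; the types, field names, and stack size are invented for illustration, and only the stack discipline described above is shown, not the full state machine of Figure 6.

/* Illustrative sketch of the loop stack used for nested-loop detection. */

#define MAX_NESTED_LOOPS 64

typedef struct {
    int init_pc;        /* start of the induction-variable initialization */
    int cmp_pc;         /* address of the conditional jump (loop test)    */
    int induction_var;  /* local-variable index of the induction variable */
} loop_info;

typedef struct {
    loop_info entry[MAX_NESTED_LOOPS];
    int       top;
} loop_stack;

/* Reaching state q7: the front part of a loop (init + bound) was found. */
static void push_loop(loop_stack *s, loop_info f)
{
    if (s->top < MAX_NESTED_LOOPS)
        s->entry[s->top++] = f;
}

/* After the whole method has been scanned, each saved front part is popped
 * and checked for a matching induction-variable update and backward goto. */
static int pop_loop(loop_stack *s, loop_info *out)
{
    if (s->top == 0)
        return 0;
    *out = s->entry[--s->top];
    return 1;
}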

5.3 Code Emission

Figure 7 shows the Java source code and Java bytecode of an example for loop pattern accepted by our state machine, and Figure 8 shows the code generated by our compiler without recognizing Java for loops. Our compiler recognizes the for loop in the Java bytecode and translates it into for in C; Figure 9 presents the result of applying this optimization to the Java bytecode in Figure 7. For our experiment we implemented the simple cases listed in Table 4. In particular, when a member field or a method of an object is used in the initialization or comparison part of a loop, translating for in Java into for in C is complicated by exception handling. Thus, we selected simple for loop patterns to be translated into for in C.

for(int l1 = 8; l1 < i; l1++) k += l1;
=============================
52: bipush 8
54: istore 10
56: iload 10
58: iload_3
59: if_icmpge 75
62: iload 4
64: iload 10
66: iadd
67: istore 4
69: iinc 10 1
72: goto 56

Fig. 7. Java source code and Java bytecode of an example of a for loop.

i0 = 8;
Lvi10 = i0;
_L56:
i0 = Lvi10;
i1 = Lvi3;
if( i0 >= i1 ) goto _L75;
_L62:
i0 = Lvi4;
i1 = Lvi10;
i0 = i0 + i1;
Lvi4 = i0;
Lvi10 += 1;
goto _L56;

Fig. 8. Code generation using test and jump.

i0 = 8;
Lvi10 = i0;
_L56:
for( ; Lvi10 < Lvi3 ; Lvi10 += 1 ) {
_L62:
  i0 = Lvi4;
  i1 = Lvi10;
  i0 = i0 + i1;
  Lvi4 = i0;
}

Fig. 9. for in C generated from the for loop in Figure 7.

Moreover, these patterns are the most frequently used patterns in code. So, before generating for in C for all the for loops in Java, we checked the performance improvement of translating these patterns into for in C.

Table 4. Java for loop patterns to be translated into C for.

for in Java translated into for in C: for(A = B; A < C; A++) { ... }

A = B
  Limited Java bytecode: iconst, fconst, lconst, dconst, bipush, sipush, iload, fload, lload, dload, getstatic, invokestatic, aload-arraylength, aload-getfield, aload-invokevirtual, aload-invokespecial, aload-invokeinterface; then istore, fstore, lstore, dstore
  for in generated C: a = b;
A < C
  Limited Java bytecode: iload, fload, lload, dload (induction variable); iload, fload, lload, dload (loop bound); if_icmpeq, if_icmpne, if_icmpgt, if_icmpge, if_icmplt, if_icmple
  for in generated C: for( ; a < c; a += 1) { ... }
{ ... }
  Limited Java bytecode: anything can be here
A++
  Limited Java bytecode: iinc, goto
  for in generated C: eliminate the goto

6 Performance Evaluation

6.1 Methodology

The performance of our Java-to-C compiler has been tested using the Java SciMark 2.0 benchmarks [18] on a 2.0 GHz Xeon processor with 256 MB of memory running Red Hat Linux 8.0, and compared with gcj. The generated C code was compiled with the gcc 3.2 C compiler. SciMark [18] is a composite Java benchmark measuring the performance of numerical codes occurring in scientific and engineering applications; it tests how well underlying systems such as a JVM, a JIT, or a Java-to-C translator perform on these applications. The benchmark consists of the following five applications: Fast Fourier Transform, Successive Over-relaxation, sparse matrix multiply, Monte Carlo integration, and dense LU factorization. Table 5 summarizes the applications briefly.

Table 5. Java SciMark 2.0 benchmark.

Application   Description
FFT           Fast Fourier Transform: exercises complex arithmetic, shuffling, non-constant memory references, and trigonometric functions.
SOR           Jacobi Successive Over-relaxation: exercises typical access patterns in finite-difference applications, for example solving Laplace's equation in 2D with Dirichlet boundary conditions.
Monte Carlo   Monte Carlo integration: exercises random-number generators, synchronized function calls, and function inlining.
SparseMM      Sparse matrix multiply: exercises indirect addressing and non-regular memory references.
LU            Dense LU matrix factorization: exercises linear algebra kernels (BLAS) and dense matrix operations.

6.2 Performance Comparison

The speedup relative to gcj is shown in Figure 10. Our Java-to-C compiler shows lower performance than gcj for all applications except SOR. Especially in the case of Monte Carlo integration, the overhead for synchronization in a single-threaded application makes our system perform worse than gcj: in our compiler, monitor locking is enabled even if an application is single-threaded. Toba [11] can reduce this synchronization overhead, since the actual monitor locking is delayed until more than one thread has been created. However, the execution time is greatly improved by turning off exception handling; the gcj compiler optimizes exception handling by using aggressive optimizations.

Moreover, by translating for in Java into for in C, we obtain faster execution time and see the possibility of further improvement in speedup. We achieved 6% more speedup in SparseMM and 4% more in Monte Carlo with loop code generation than with test and jump. Such a small improvement is due to the limited set of loop patterns translated into for in C in our experiment: our Java-to-C compiler generates for in C only when the modification of the induction variable is performed by an iinc instruction. Furthermore, the for loops recognized by our compiler and translated into for in C are not the performance-critical loops in the FFT, SOR, and LU applications.

In the case of FFT, we could get much better performance when all major for loops were changed into for in C. For this experiment we identified the major loops of each benchmark program and then translated the Java for loops into for in C by hand whenever our compiler did not translate them. The most critical method of FFT is transform_internal, and it has four for loops; our Java-to-C compiler translates one of them into for in C. Figure 11 presents the performance improvement when the other loops are translated into for in C by hand. When all the for loops are translated properly, the performance improves by up to 24% over test and jump code generation and is 5% faster than gcj. In the case of LU, the result is somewhat different. Even though all the for loops in the most critical method of LU, factor, were changed into for in C by hand, the performance was not improved.

Fig. 10. Java SciMark 2.0 speedup.

Fig. 11. Performance improvement of FFT when other loops are translated.

Therefore, we conclude that the for loops in factor of LU are not an important factor in performance, and that the loop optimization provided by the gcc compiler does not help much in improving it. In the case of SOR, we found that all the loops in the most critical method, execute, are already translated into for in C by our compiler. Therefore, we can conclude that in SOR the for loops are not the determining factor in overall performance.

7 Conclusion

Several approaches to improving the performance of Java applications are still being pursued. In particular, the JIT compiler has been treated as the best solution to the problem, and much research has been devoted to it. In addition to the JIT compiler, the Java-to-C compiler is an alternative solution, but it has not been widely developed to resolve the performance issues because of its implementation complexity. In this paper, we presented the structure of our Java-to-C compiler, which preserves the Java semantics, and discussed code generation and optimization issues. We also presented the implementation of translating for in Java into for in C in order to improve performance. Because of the limited set of loop patterns that can be translated into for in C, we could not obtain as much performance as expected; however, by changing the remaining for loops into for in C, we see the possibility of further improvement. The generated C code includes many pointers and complex expressions, which prevent AOT compilers from applying advanced compiler optimization techniques such as constant propagation, common sub-expression elimination, inlining, and so on. In ongoing research, we have developed an IR framework between bytecode and C code to help AOT compilers generate better-quality code, and we have extended our compiler to generate for in C for a broader range of for loop patterns in Java source code.

References

1. Java 2 Platform, Standard Edition. http://java.sun.com/j2se/.
2. Java 2 Platform, Micro Edition.
3. The Source for Java Technology. Sun Microsystems.
4. David Bacon, Stephen Fink, and David Grove. Space and time efficient implementation of the Java object model. In European Conference on Object-Oriented Programming (ECOOP 2002), pages 10–14, Malaga, Spain, June 2002.
5. Sadaf Mumtaz and Naveed Ahmad. Architecture of Kaffe. http://www.kaffe.org.
6. Tya. http://sax.sax.de/~adlibit/.
7. shuJIT. http://www.shudo.net/jit/#docs.
8. OpenJIT. http://www.openjit.org/.
9. LaTTe: An open-source Java virtual machine and just-in-time compiler. http://latte.snu.ac.kr.
10. Guide to GNU gcj. http://gcc.gnu.org/java/index.html.
11. Todd A. Proebsting, Gregg Townsend, Patrick Bridges, John H. Hartman, Tim Newsham, and Scott A. Watterson. Toba: Java for applications: A way ahead of time (WAT) compiler. In Proceedings of the 3rd USENIX Conference on Object-Oriented Technologies and Systems (COOTS 1997), June 1997.
12. Ankush Varma and Shuvra S. Bhattacharyya. Java-through-C compilation: An enabling technology for Java in embedded systems. In Design Automation and Test in Europe (DATE 2004), Paris, France, February 2004.
13. Gilles Muller and Ulrik Pagh Schultz. Harissa: A hybrid approach to Java execution. IEEE Software, pages 44–51, March 1999.
14. Byung-Sun Yang, Soo-Mook Moon, Seongbae Park, Junpyo Lee, SeungIl Lee, Jinpyo Park, Yoo C. Chung, Suhyun Kim, Kemal Ebcioglu, and Erik R. Altman. LaTTe: A Java VM just-in-time compiler with fast and efficient register allocation. In IEEE PACT, pages 128–138, 1999.
15. Tim Lindholm and Frank Yellin. The Java Virtual Machine Specification. Addison-Wesley, 1996.
16. Boehm-Demers-Weiser conservative garbage collector. http://www.hpl.hp.com/personal/Hans_Boehm/gc/.
17. Connected Limited Device Configuration (CLDC). http://java.sun.com/products/cldc/.
18. SciMark 2.0. http://math.nist.gov/scimark2/.
