
Native-Language-Based Distributed Computing Across Network and Filesystem Boundaries

Blending the portability of Java with computational engines written in native languages to achieve cross-network, heterogeneous, concurrent computing.

Paul A. Gray and Vaidy S. Sunderam Emory University Dept. of Mathematics and Comp. Sci. fgray,[email protected]

February 5, 1998

Abstract

This paper discusses how features unique to the Java programming language can be combined with complementary features of other languages such as C and Fortran. Combining the strengths of Java, such as portability and platform independence, with packages and legacy codes written in traditional languages such as C and Fortran yields a program blend which exhibits portability and speed not realizable by any of these languages individually. One area where this confluence of previously disparate language features has strong potential is distributed, concurrent computing over heterogeneous platforms and across local network and filesystem boundaries, which is the setting addressed in this paper. Also addressed are the pivotal aspects of the Java bytecode representation of a class object which make the porting of shared libraries across network boundaries, filesystems, and architectures possible.

1 Blending Existing and Evolving Technologies

As a programming language, Java has several effective language-level features which make it highly portable. Many of these features are supported by the static representation of a Java program's executable state: its bytecode representation [8]. This encapsulation of a programming unit into a concise, compact, system-independent form goes a long way toward Java's programming-side goal of "write once, run anywhere," where a programmer can write and compile a program on any platform and run the compiled bytecode on any other platform.

However, the gain in a Java program's portability often comes at the cost of execution speed. While there are many ongoing projects which aim to promote Java as a high-performance language ([1] and [7], for example), its lack of execution speed is arguably one of the most significant obstacles which must be overcome.

Another factor which keeps any new programming language from wide acceptance by the scientific community is the large amount of legacy code in existence. "Legacy" code refers to scientific programs already written in languages such as C, C++, and/or Fortran. Many scientific packages and communication libraries exist as highly optimized, system-dependent codes or libraries written in Fortran, C, or variants thereof (such as HPF [10] or HPC++ [3]). Switching to a new programming language could mean abandoning these familiar and established packages for less familiar and perhaps less stable variants written in the newer language.

Table 1 below is given in support of two distinct issues:

1. the scientific community's reluctance to accept Java as a high-performance language for lack of speed, and
2. the potential for using Java as a front-end (a wrapper) to legacy codes and packages.

Standalone Java-based   Standalone C Implementation   Java-wrapped LINPACK   C-wrapped LINPACK engine
Program (w/ JIT)        (w/ Compiler Optimization)    engine (w/ JIT)        (w/ optimized C wrapper)
---------------------   ---------------------------   --------------------   ------------------------
437.9 (103.4)           84.7 (24.4)                   42.7 (41.2)            17.0 (16.9)

Table 1: Benchmarks (in seconds) under various implementations for the computation of multiplying two 500x500 matrices on a single Sun Sparc20 workstation. The quantities in parentheses represent the best-effort compiler/JIT optimizations of the respective implementations.

The data in Table 1 show execution times for the multiplication of two 500 × 500 matrices under various implementations. The first column shows a standalone Java implementation of the task using the 1.1.4 release of the Java Development Kit (JDK); the second column shows a standalone C implementation. The last two columns show implementations of the matrix multiplication task which mix languages and utilize the well-established numerical codes found in the Fortran-based LINPACK BLAS (see footnote 1) [2]. The third column shows a Java-based wrapper which utilizes the LINPACK ddot subroutine to perform the column-row dot products of the matrix multiplication. The final column shows an equivalent implementation using C as the wrapper to the LINPACK ddot subroutine. Times given in parentheses are best-effort compiler optimizations of the implementations.

While Table 1 lends strong support to the viability of using Java as a wrapper to legacy codes, it says nothing about the additional levels of portability which the Java/LINPACK program collective (see footnote 2) derives from its Java front-end. Thus, although one issue addressed in this paper is the extent to which one can hope to alleviate the two issues enumerated above, also addressed is the consequence of blending Java with existing languages in a manner which leverages the advantageous features of the respective languages in order to achieve program collectives with levels of portability and speed which cannot be achieved by any single language. To this end, the latter sections of this paper illustrate the advances in portability and speed enjoyed by Java/C/Fortran blendings as they relate to distributed, scientific computing.

Footnote 1: Basic linear algebra subroutines.

[Figure 1: The IceT environment supports dynamic, multi-phased processes. The diagram shows a mobile, multi-phase IceT process passing through data input, computational processing, and post processing.]
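The arrangement behind the third column of Table 1, a Java driver that delegates each dot product to a ddot routine, can be sketched as follows. This is a minimal illustration and not the benchmark code: the class name `DdotMatMul` and the method signatures are hypothetical, and `ddot` is written here in pure Java as a stand-in so that the sketch runs on its own. In the paper's setting it would instead be a method declared `native` and bound to the Fortran LINPACK routine through a shared library.

```java
import java.util.Arrays;

// Sketch: C = A * B computed one dot product at a time.
// All names here are illustrative assumptions, not taken from the paper.
final class DdotMatMul {

    // Pure-Java stand-in for the BLAS routine ddot: dot product of two
    // strided vectors of length n. In a Java-wrapped LINPACK build this
    // would be a native method instead.
    static double ddot(int n, double[] x, int xOff, int incX,
                       double[] y, int yOff, int incY) {
        double sum = 0.0;
        for (int k = 0; k < n; k++) {
            sum += x[xOff + k * incX] * y[yOff + k * incY];
        }
        return sum;
    }

    // Multiplies two n-by-n matrices stored in row-major order.
    static double[] multiply(int n, double[] a, double[] b) {
        double[] c = new double[n * n];
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < n; j++) {
                // Row i of a is contiguous (stride 1); column j of b has stride n.
                c[i * n + j] = ddot(n, a, i * n, 1, b, j, n);
            }
        }
        return c;
    }

    public static void main(String[] args) {
        double[] a = {1, 2, 3, 4};   // [[1, 2], [3, 4]]
        double[] b = {5, 6, 7, 8};   // [[5, 6], [7, 8]]
        System.out.println(Arrays.toString(multiply(2, a, b)));
    }
}
```

Only the body of `ddot` would change when the native routine is substituted; the Java driver loop, which is the portable part of the program collective, stays the same.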

2 The IceT Scaffolding

The objective of the IceT project has been to support distributed, collaborative, and parallel computing across clusters of workstations existing on distinct networks and supported by individual filesystems. Within this fundamental framework, individual resource environments can be merged together or separated in order to facilitate multi-phase programming paradigms. For example, a program might begin its execution in a data-input phase, where internal data structures are filled using data stored on the local filesystem (phase 1; see Figure 1). After the initialization phase, the program might carry out computationally intense calculations on the data, computations which consume a considerable amount of system resources (phase 2). Upon completion of the calculation phase, the program might enter a data analysis phase, or a data manipulation phase in which the results of the computation are transformed into some visualization or merged into an existing database (phase 3).

The IceT environment aims to directly support the underlying environmental transitions, providing optimal process mapping onto the appropriate available computational resources. Perhaps initialization would take place on an older workstation which has been converted into a data repository (phase 1 of the above example). Upon initialization, the processes and the data could move to a higher-performance computing environment such as an isolated cluster of Sun UltraSparcs (phase 2). Upon completion of the computations, the processes would perhaps move with the results of the computations to an environment suitable for data visualization, perhaps a Silicon Graphics database server (phase 3).

IceT provides the underlying support required to establish connections and to maintain communication between resources on different networks. Less obvious is the need for an underlying mechanism which can ferry processes across network boundaries and instantiate processes upon arbitrary, heterogeneous resources. To this end, IceT also manages the task of collecting and identifying the requisite components needed to create and migrate processes amongst heterogeneous computational resources. Thus, IceT aims at supporting program collectives which exhibit multiple states and phases by supporting a nomadic environment which is dynamically configurable.

The distributed computing paradigm directly supported is message-passing synchronization amongst cooperating sequential processes which interact via explicit messages. This paradigm is modeled after the successful and established models of message passing found in the PVM [4] and MPI [11] distributed computing environments.

Footnote 2: A program collective refers to the primary class, the shared libraries appropriate for the intended architecture, and the complete collection of dependency classes.

3 Creating Remote Processes

In some sense, creating a remote process within the IceT environment closely resembles the manner in which a web browser loads and executes an applet referred to within a given .html document. The main task in creating a process is to locate the specified primary class and load the non-local dependency classes when needed. Whereas a web browser loads remote classes onto a local resource, IceT extends the process creation paradigm to include the reverse, where the user specifies classes to be "soft-installed" on a remote host. In addition, whereas applets running within web browsers are significantly restricted in their scope and abilities, IceT processes are full applications, governed by the security restrictions placed upon them by the owners of the respective resources. Finally, one of the most distinguishing characteristics of an IceT process is its ability to include portable native libraries in the program collective.

Assuming that an IceT environment has been established using mechanisms described in [6], Figure 3 illustrates how IceT manages creation of a process from the perspective of the remote host where the process is to be created. Using the Java bytecode representation, the structure of the process to be created consists of various Interface implementations, class dependencies, and dependence upon architecture-specific shared libraries, as depicted in Figure 2.

[Figure 2: The static form of an IceT process is a Java-based "class" file which consists of interfaces, dependency classes and perhaps shared libraries. Diagram labels: Interface_0, A_Process, Class_A, Package_1, Class_1_A, Class_1_B, libnative.so, native.dll.]

[Figure 3: The soft-installation of the "A_Process" class of Figure 2 at the request of some entity in the IceT environment (bottom) as a process on the specified host in the IceT environment (top). The sequence of events flows from left to right. Events shown: the IceT environment asks "RemoteHost" to create the process "A_Process"; no static representation of "A_Process" is found in the local search, so its static form is requested and its bytecode is returned; the bytecode shows dependence on Interface_0, Class_A, Package_1.Class_1_A, and Package_1.Class_1_B; each of these is requested and its bytecode returned in turn; Class_A and Package_1.Class_1_A have no outstanding dependencies; Package_1.Class_1_B requires a shared library, so the shared library "native" for architecture "arch" is requested and its binary form returned; with dependencies resolved, instantiation is attempted.]

The remaining portion of this section details some individual steps involved when instantiating a process upon a remote host. The framework described below assumes that the computational resources involved are current members of the IceT resource pool, but not necessarily on the same network nor singly owned. The interesting case is when the remote resource does not extend privileges (no login, ftp, etc.) to the user actually requesting the creation of the process.

On RemoteHost, an IceT daemon listens for messages. In this case, the IceT daemon on RemoteHost is contacted with a message requesting the creation
of the process "A_Process." The RemoteHost daemon searches for A_Process.class within the directories specified in the local "CLASSPATH" environment variable. If A_Process is found locally, it can be instantiated and executed, if permitted, in a straightforward manner. If the local search for A_Process fails, then a signal indicating such is sent back to the requesting resource. The expected response is the bytecode for A_Process.class.

[Figure 4: The incoming bytecode representation for the class pending instantiation is dissected and analyzed to determine class dependencies, method invocations, and native library dependence. This analysis aids in detection of security violations and detection of native libraries appropriate to the architecture. The diagram shows the bytecode for class "A_Process" arriving at "RemoteHost" as a byte stream and being separated into a dependency class list, native library dependencies, and a methods enumeration.]

The static form of the incoming bytecode is then subjected to a thorough analysis. This analysis includes:

1. verification of the magic number (CAFEBABE),
2. noting of the major and minor version numbers written by the compiler which generated the bytecode,
3. a reading of the constant pool count, and subsequently,
4. an internal reconstruction of the constant pool for the class (see footnote 3).

The generation of the class' constant pool allows for continued and more detailed analysis of the bytecode. The analysis continues with an investigation of the subsequent references in the constant pool. This yields an enumeration of the implemented Interfaces, the fields, and the methods of the class.

At this point, various characterizations of the methods used by the class can be determined. Methods such as System.load and certain methods of the File or Socket class which might pose a security threat can be observed at this stage, and the local daemon can deny the request for process creation. However, if permission is given for the class to utilize native code, further analysis of the bytecode associated with each of the methods of the class is required to actually obtain the name of the shared library that is being wrapped or is required by this class (see footnote 4). The bytecode of each of the class' methods is scanned for the JVM opcodes invokestatic (0xB8) or invokevirtual (0xB6) associated with the Runtime.loadLibrary, System.load, or similar (and possibly obfuscated) calls which register the shared library with the JVM. The argument of the call, which is the actual name of the native library to be loaded, is then obtained by reconstructing the specific String found in the constant pool (see footnote 5) which is passed as the argument to the load or loadLibrary call.

Footnote 3: Refer to the JVM specification [8] for the layout and definition of the bytecode representation of a Java class file.
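The first four analysis steps listed above (magic number, version numbers, constant pool count, constant pool reconstruction) can be sketched with the standard `java.io` classes. This is a minimal sketch under stated assumptions rather than the IceT implementation: the class `ClassHeaderSketch` and the helper `sampleHeader`, which fabricates the opening bytes of a made-up class file for demonstration, are hypothetical. The sketch stops at the end of the constant pool and records only the names behind the Class entries, which is the raw material for a dependency class list. It also skips over constant pool tags that were added to the class file format after this paper was written, so that it can read the header of a present-day class file.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not IceT code): reads a class file up to the end of its constant pool.
final class ClassHeaderSketch {
    int minor;
    int major;
    final List<String> classNames = new ArrayList<>(); // names behind Class entries

    static ClassHeaderSketch parse(byte[] classBytes) {
        ClassHeaderSketch h = new ClassHeaderSketch();
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(classBytes))) {
            if (in.readInt() != 0xCAFEBABE) {                   // step 1: magic number
                throw new IllegalArgumentException("not a class file");
            }
            h.minor = in.readUnsignedShort();                   // step 2: version numbers
            h.major = in.readUnsignedShort();
            int count = in.readUnsignedShort();                 // step 3: constant pool count
            String[] utf8 = new String[count];
            int[] classNameIndex = new int[count];
            for (int i = 1; i < count; i++) {                   // step 4: walk the pool entries
                int tag = in.readUnsignedByte();
                switch (tag) {
                    case 1: utf8[i] = in.readUTF(); break;                     // Utf8 text
                    case 3: case 4: in.skipBytes(4); break;                    // Integer, Float
                    case 5: case 6: in.skipBytes(8); i++; break;               // Long, Double take two slots
                    case 7: classNameIndex[i] = in.readUnsignedShort(); break; // Class -> Utf8 index
                    case 8: case 16: case 19: case 20: in.skipBytes(2); break; // String and later 2-byte entries
                    case 15: in.skipBytes(3); break;                           // MethodHandle
                    case 9: case 10: case 11: case 12: case 17: case 18:
                        in.skipBytes(4); break;                                // member refs, NameAndType, dynamic
                    default: throw new IOException("unknown constant pool tag " + tag);
                }
            }
            for (int index : classNameIndex) {                  // resolve Class entries to names
                if (index != 0) h.classNames.add(utf8[index]);
            }
        } catch (IOException e) {
            throw new IllegalArgumentException("malformed class file", e);
        }
        return h;
    }

    // Builds the first bytes of a made-up class file, for demonstration only.
    static byte[] sampleHeader() {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeInt(0xCAFEBABE);
            out.writeShort(3);                                  // minor version
            out.writeShort(45);                                 // major version
            out.writeShort(5);                                  // pool count = entries + 1
            out.writeByte(1); out.writeUTF("A_Process");        // #1 Utf8
            out.writeByte(7); out.writeShort(1);                // #2 Class -> #1
            out.writeByte(1); out.writeUTF("Class_A");          // #3 Utf8
            out.writeByte(7); out.writeShort(3);                // #4 Class -> #3
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
        return buffer.toByteArray();
    }

    public static void main(String[] args) {
        ClassHeaderSketch h = parse(sampleHeader());
        System.out.println(h.major + "." + h.minor + " " + h.classNames);
    }
}
```

Class entries are resolved only after the whole pool has been read, because an entry may refer to a Utf8 entry that appears later in the pool. `DataInputStream.readUTF` is used because the length-prefixed, modified UTF-8 encoding it reads is the encoding used by Utf8 constant pool entries.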

Note: Two issues arise at this point regarding where the native codes (shared libraries) should be obtained from: "What should be done if the user requesting creation of the process cannot produce a shared library appropriate for the architecture on which the process is to be created?" and "What should be done if complete trust for using native code on the local system is not given to the requesting party?" To address these two situations, there is an effort to develop access to binary forms of precompiled shared libraries for standard scientific packages such as those found on Netlib (see footnote 6). Thus, a user running on the WindowsNT platform, for example, would be able to create processes on a remote SGI cluster which make calls to methods found in scientific packages made available in shared library form for the SGI architecture through a trusted third-party repository.
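Locating a library "appropriate for the architecture" requires turning a logical library name into a platform-specific file. A minimal sketch of that mapping, using only standard Java calls, is given below. The repository layout (architecture directory followed by file name) and the class name `NativeLibraryName` are assumptions made for illustration; the paper does not describe how IceT or a third-party repository organizes its files.

```java
// Sketch: derive a lookup key for the shared library "native" on the local host.
// The "<arch>/<file>" layout is a hypothetical convention, not an IceT format.
final class NativeLibraryName {

    static String repositoryPath(String libraryName) {
        // os.arch is a standard system property describing the host architecture.
        String arch = System.getProperty("os.arch");
        // mapLibraryName applies the platform's naming rule, for example a
        // "lib" prefix and ".so" suffix on many Unix systems, or a ".dll" suffix on Windows.
        String fileName = System.mapLibraryName(libraryName);
        return arch + "/" + fileName;
    }

    public static void main(String[] args) {
        System.out.println(repositoryPath("native"));
    }
}
```

The same logical name passed to `System.loadLibrary` therefore corresponds to different files on different hosts, which is why Figure 2 lists both libnative.so and native.dll for a single class.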

It is imperative that the name of the library be obtained prior to any attempt at instantiating the class. Since this library might not exist on RemoteHost, the library would need to be obtained prior to attempting to instantiate the process as a Java object. One simple way of determining the name of the native library which this class depends upon would be to attempt to resolve the class, whereupon an UnsatisfiedLinkError would be thrown, specifying in its stack trace the name of the library which it could not link against. However, such an approach would be for naught. Intuitively, the missing library could then be obtained and placed in the library search path on RemoteHost, but the JVM specification mandates that all subsequent attempts to load a library which previously failed to load are to be ignored. For this reason, the name of the dependent shared library must be extracted from the static bytecode representation, and the static form of the shared library must be located and placed or moved (if necessary) to a directory specified in the library search path of the environment under which the daemon of the RemoteHost is running.

Once all the native libraries required by the class have been verified to be in an appropriate local directory, a list of the primary class' dependency classes is generated. Each dependency class is then, in turn, obtained similarly and analyzed accordingly for dependency classes, security violations, and/or dependence upon native library calls.

This complete analysis of the dependency classes differs slightly from the manner in which a web browser behaves. A web browser will instantiate the primary class without having obtained all of the dependency classes. The browser only obtains missing dependency classes when they are actually invoked or their methods accessed in the course of the process' execution. This is beneficial to the browser in the sense that it will not waste a considerable amount of time obtaining and resolving classes in the dependency tree which will never be accessed within the course of the process' execution. In the IceT setting, where dynamic merging and splitting occurs, communication with the host which served up the application might be severed during the execution of the process; perhaps the host was split apart from the virtual machine and is no longer reachable. Such a scenario would result in a fatal UnsatisfiedLinkError being thrown upon encountering the missing class, terminating the process. By obtaining all of the dependency classes a priori, one avoids this potential pitfall.

Upon soft-installing the entire program collective upon RemoteHost, the primary class is instantiated and cast into a subclass of the Thread object. All IceT processes are subclasses of the TaskProtocols class, which defines the message-passing methods (send, recv, etc.) and the environmental configuration methods (add, merge, spawn, etc.). The TaskProtocols class itself is a subclass of the Thread class. Thus, any IceT process also inherits properties from the class Thread. As such, the execution of the instantiated process may be started, stopped, suspended, re-prioritized, or, if the class implements the Serializable interface, serialized to disk for later execution or even to an alternate host within the combined resource pool.

Reiterating, the analysis of the bytecode associated with the class described above is performed on the static representation of the class, as opposed to analyzing the class after its instantiation or during its execution. The governing of the process in these latter states can be managed by security policies enforced by the local SecurityManager.

In concluding this section, it should be emphasized that the bytecode representation of a Java class file aids in characterizing the expected behavior of the instantiated form of the process. In dissecting the bytecode associated with the class file, one can detect use of methods derived from File, Socket, or any other methods that one might want to restrict a process from invoking on the resource. In addition, the actual JVM opcodes of the methods are used to extract the name of the shared library which the class might need to link against, which enables the locating and local installation of the requisite library prior to creating an instance of the process.

Footnote 4: This further analysis is not obtainable through Java's Reflection class nor its profiling information, and so it must be explicitly extracted using the methods' opcodes and the references to the Constant_UTF8 objects within the constant pool.

Footnote 5: Dynamic generation of shared library names is not permitted. Only "hardcoded" library names, or those which have appropriate references to Constant_UTF8 objects in the constant pool, are allowed.

Footnote 6: Netlib is a collection of mathematical software, papers, and databases. More information is available at http://www.netlib.org.
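The extraction of a hardcoded library name from a method's opcodes can be illustrated with a deliberately simplified scan. The sketch below is an assumption-laden illustration, not the IceT analyzer: it takes the raw code bytes of one method together with two already-resolved views of the constant pool (string constants by index, and method references by index in a hypothetical "owner.name" form), and it looks only for the pattern of an `ldc` instruction (0x12) immediately followed by invokestatic (0xB8) or invokevirtual (0xB6). A real analyzer would decode instruction lengths instead of sliding byte by byte, and would also handle `ldc_w` and obfuscated call sequences.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: find string arguments passed directly to load/loadLibrary calls.
final class LoadLibraryScan {
    static final int LDC = 0x12;
    static final int INVOKEVIRTUAL = 0xB6;
    static final int INVOKESTATIC = 0xB8;

    static List<String> libraryNames(byte[] code,
                                     Map<Integer, String> stringConstants,
                                     Map<Integer, String> methodRefs) {
        List<String> names = new ArrayList<>();
        // Pattern: ldc <index> ; invoke* <indexHigh> <indexLow>  (five bytes in total)
        for (int pc = 0; pc + 4 < code.length; pc++) {
            int opcode = code[pc] & 0xFF;
            int following = code[pc + 2] & 0xFF;
            if (opcode != LDC || (following != INVOKESTATIC && following != INVOKEVIRTUAL)) {
                continue;
            }
            int stringIndex = code[pc + 1] & 0xFF;
            int methodIndex = ((code[pc + 3] & 0xFF) << 8) | (code[pc + 4] & 0xFF);
            String target = methodRefs.get(methodIndex);
            String argument = stringConstants.get(stringIndex);
            if (target != null && argument != null
                    && (target.endsWith(".loadLibrary") || target.endsWith(".load"))) {
                names.add(argument);
            }
        }
        return names;
    }

    public static void main(String[] args) {
        // ldc #2 ; invokestatic #5 ; return
        byte[] code = {0x12, 0x02, (byte) 0xB8, 0x00, 0x05, (byte) 0xB1};
        System.out.println(libraryNames(code,
                Collections.singletonMap(2, "native"),
                Collections.singletonMap(5, "java/lang/System.loadLibrary")));
    }
}
```

Because the scan relies on a string constant sitting directly before the call, it only finds names that are hardcoded in the constant pool, which matches the restriction stated in footnote 5.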

4 Extended Functionality

At the present time, there are several projects underway which look to generate Java bindings to existing scientific packages such as MPI, ScaLAPACK, etc. [9]. Consequently, the focus in this section is not on the manner in which native packages can be wrapped to create Java bindings, but rather on demonstrating how the IceT environment can blend Java together with native libraries in a portable manner, and with computational speed that a pure Java implementation does not match.

As was shown in Table 1, the execution speed of a pure Java implementation of the multiplication of two 500 × 500 matrices on a single workstation lagged significantly behind the comparable C-based implementation, while blending these languages with the Fortran-based LINPACK routines gave significant speedups. The results in Table 2 show that commensurate speedups are also seen in the parallelization of the task over the IceT environment. Other parallelization benchmarks which compare IceT to PVM are given in [5].

# Processors in computation   Java-wrapped LINPACK engine (w/ JIT)
---------------------------   ------------------------------------
1                             41.2
4                             25.60
8                             11.95

Table 2: Benchmarks (in seconds) for distribution of multiplying two 500x500 matrices on a cluster of Sun Sparc20 workstations, using full optimizations.
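The timings in Table 2 can be restated as speedup and parallel efficiency using the usual definitions: speedup is the single-processor time divided by the p-processor time, and efficiency is speedup divided by p. These derived figures are not reported in the paper; the short sketch below, whose class and method names are illustrative, simply computes them from the table.

```java
// Sketch: speedup and efficiency derived from the Table 2 timings.
final class SpeedupSketch {

    static double speedup(double serialSeconds, double parallelSeconds) {
        return serialSeconds / parallelSeconds;
    }

    static double efficiency(double serialSeconds, double parallelSeconds, int processors) {
        return speedup(serialSeconds, parallelSeconds) / processors;
    }

    public static void main(String[] args) {
        int[] processors = {1, 4, 8};
        double[] seconds = {41.2, 25.60, 11.95};   // values from Table 2
        for (int i = 0; i < processors.length; i++) {
            System.out.printf("%d processors: speedup %.2f, efficiency %.2f%n",
                    processors[i],
                    speedup(seconds[0], seconds[i]),
                    efficiency(seconds[0], seconds[i], processors[i]));
        }
    }
}
```

By these definitions the table corresponds to a speedup of roughly 1.6 on four processors and roughly 3.4 on eight, that is, an efficiency a little above 0.4 in both cases.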

Work in progress includes benchmarking computations across truly heterogeneous clusters (such as depicted in Figure 1), where primary classes and system-specific native codes are ported and linked in a dynamic fashion across network and filesystem boundaries.

5 Summary

The work shown in this paper supports the strong potential for a heterogeneous, distributed programming environment which extends beyond local networks and filesystems, supported by a blending of languages. Specifically, using Java as the primary wrapper to the program collective, and established scientific languages such as C and Fortran for the computational engine of the collective, results in portability and speed suitable for distributed computing. The use of Java as the primary element of the program collective allows for a complete static analysis of its bytecode representation, from which the native library dependence can be determined, in addition to a characterization of the class' use of methods which might pose a security risk to the local resource on which it is to be created.

References

[1] Carpenter, B., Chang, Y. J., Fox, G., Leskiw, D., and Li, X. Experiments with 'HP Java'. Concurrency, Experience and Practice 9, 6 (June 1997), 633-648.

[2] Dongarra, J., Bunch, J., Moler, C., and Stewart, P. The FORTRAN-based LINPACK routines. Available from NAG, Downers Grove, IL.

[3] Gannon, D., Beckman, P., Johnson, E., Green, T., and Levine, M. HPC++ and the HPC++Lib Toolkit. Tech. Rep. White Paper, Indiana University, Nov. 1997.

[4] Geist, G. A., and Sunderam, V. S. The PVM system: Supercomputer level concurrent computation on a heterogeneous network of workstations. In Proceedings of the Sixth Distributed Memory Computing Conference (1991), IEEE, pp. 258-261.

[5] Gray, P., and Sunderam, V. IceT: Distributed Computing and Java. Concurrency, Experience and Practice 9, 11 (Nov. 1997), 1161-1168.

[6] Gray, P., and Sunderam, V. The IceT Environment for Parallel and Distributed Computing. In Scientific Computing in Object-Oriented Parallel Environments (New York, Dec. 1997), Y. Ishikawa, R. R. Oldehoeft, J. V. W. Reynders, and M. Tholburn, Eds., no. 1343 in Lecture Notes in Computer Science, Springer Verlag, pp. 275-282.

[7] Hummel, J., Azevedo, A., Kolson, D., and Nicolau, A. Annotating the Java bytecodes in support of optimization. Concurrency, Experience and Practice 9, 11 (Nov. 1997), 1001-1016.

[8] Lindholm, T., and Yellin, F. The Java Virtual Machine Specification. Addison Wesley, 1997, ch. 4, pp. 83-106.

[9] Mintchev, S., and Getov, V. Automatic binding of native scientific libraries to Java. In Scientific Computing in Object-Oriented Parallel Environments (New York, Dec. 1997), Y. Ishikawa, R. R. Oldehoeft, J. V. W. Reynders, and M. Tholburn, Eds., no. 1343 in Lecture Notes in Computer Science, Springer Verlag, pp. 129-136.

[10] Richardson, H. High Performance Fortran: history, overview and current developments. 1.4 TMC-261, Thinking Machines Corporation, Bedford, MA, Apr. 1996.

[11] Snir, M., Otto, S. W., Huss-Lederman, S., Walker, D. W., and Dongarra, J. MPI, The Complete Reference. MIT Press, Nov. 1995.
