SUMA: A Scientific Metacomputer

Emilio Hernández    Yudith Cardinale    Alejandro Teruel    Carlos Figueira

Universidad Simón Bolívar

Abstract

This article addresses the development of a metacomputing system for the execution of Java bytecode, with additional support for scientific computing. The goal is to extend the Java Virtual Machine model by providing both access to distributed high performance resources and execution of native scientific functions. suma currently executes three kinds of code: sequential Java bytecode, parallel Java bytecode (communicating with mpiJava) and suma native code, which includes standard mathematical libraries like Lapack and PLapack.

1 Introduction

There is increasing interest in using Java as a language for high performance computing [1]. Java provides a portable, secure, clean object oriented environment for application development. Recent results have shown that Java has the potential to attain the performance of traditional scientific languages [2, 3, 4]. On the other hand, access to distributed high performance computing facilities through a metacomputing system has recently gained considerable acceptance [5]. A metacomputing system allows uniform access to heterogeneous resources, including high performance computers. This is achieved by presenting a collection of different computers as a single virtual machine. We address the development of a metacomputing system for Java programs, called suma (Scientific Ubiquitous Metacomputing Architecture). The goal is thus to extend the Java Virtual Machine model to provide seamless access to distributed high performance resources.

(Departamento de Computación y T. I., Universidad Simón Bolívar, Apdo. 89000, Caracas 1080-A, Venezuela. Contact [email protected])

suma can be described as a "Datorr" project (Desktop Access to Remote Resources) [6]. The suma middleware is object oriented and built on top of CORBA, a commodity communication platform. Other well-known metacomputing projects, like Globus [7] and Legion [8], provide their own communication platform. Using a standard communication platform allows us to benefit from a variety of implementations. We focus on the development of the distributed services that comprise the middleware layer and on the design and development of easy-to-use clients. Apart from a modified java command for remote execution, we developed commands for batch, off-line processing, and interactive clients like a scientific calculator with matrices as operands. suma executes three kinds of code: sequential Java bytecode, parallel Java bytecode (communicating with mpiJava [4]) and suma native code, which currently includes standard mathematical libraries like Lapack [9] and PLapack [10]. The rest of the document is organized as follows. Section 2 introduces the models underlying the execution of programs in suma. Section 3 shows different ways users can interact with suma. Section 4 describes some experiences with suma. Finally, in section 5 we address some conclusions and future work.

2 Execution of programs in SUMA

This section describes the execution model that any client of suma has to follow in order to execute Java programs, and the environment on which these programs are executed.

2.1 ExecutionUnit

[Diagram: an ExecutionUnit aggregating Code, Data, Nodes and Capability.]

Figure 1: ExecutionUnit

A program in suma is represented by an object called an ExecutionUnit. An ExecutionUnit (figure 1) is an aggregate of the following components:

- Code: a collection of SCode objects that can be executed. An SCode (single code) object can be a sequential program (e.g., a Java class), a parallel program (a Java parallel application) or native code (for instance, a Lapack routine).

- Data: a collection of data structures loaded by the user or generated by the programs.

- Nodes: a representation of the hardware on which the programs are executed, visible only to expert users and administrators. This object includes information on resource usage and performance.

- Capability: a capability list, which encodes the permissions for using suma resources. This object is not visible to users.
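The aggregate described above can be sketched as a plain data structure. This is only an illustration: every class and field name here is an assumption, not the actual suma interface.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch of the ExecutionUnit aggregate; names are assumptions.
class SCode {
    final String name;      // e.g. a Java class name or a Lapack routine name
    final byte[] bytecode;  // null for native routines resolved on the server
    SCode(String name, byte[] bytecode) { this.name = name; this.bytecode = bytecode; }
}

class ExecutionUnit {
    final List<SCode> code = new ArrayList<>();        // programs to run
    final Map<String, Object> data = new HashMap<>();  // user or program data
    final Map<String, String> nodes = new HashMap<>(); // hardware info (expert view)
    final List<String> capability = new ArrayList<>(); // permissions (hidden from users)

    void loadClass(String name, byte[] bytecode) { code.add(new SCode(name, bytecode)); }
    void loadData(String key, Object value)      { data.put(key, value); }
}
```

The point of the sketch is only the shape of the aggregate: code and data are loaded incrementally, while nodes and capability are managed by the system rather than the user.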

2.2 Execution Model

The basics of executing Java programs in suma are simple. The client machine executes load methods and execution methods. The class files are loaded into the Code component of the ExecutionUnit. It is possible to postpone loading a class until run-time (in this case, at least a reference to the main class must be passed, and that class is loaded when execution starts). In addition to loading classes, some of the data files can be loaded into the Data component of the ExecutionUnit. Any data structures other than files used by the class files (e.g. instances of the suma native classes SUMA_Vector or SUMA_Matrix) must be loaded into Data. After these static time actions, an execution method is invoked, initiating the execution of the ExecutionUnit. At run-time, the class files that were not initially loaded are loaded on demand from the client machine. Read and write operations on data files that were not initially loaded into the Data object are routed to the client machine. This mechanism is useful if the data file is very big and the program needs only a few bytes from it. Output files can be created in the Data object in order to avoid excessive I/O operations between the server and the client machine. Under the scheme described above we have two extreme cases. On the one hand, the user only specifies the main class in the ExecutionUnit. The rest of the classes and data files are read from the client at run-time, on demand from the executing class. This scheme is communication intensive. On the other hand, the user can load all of the

class files and data files in the ExecutionUnit before invoking the execution method. The output files are stored in the Data object during the execution phase. After the execution has finished, the user can read the data structures (including files) contained in the Data object. Between these two extreme cases, the user can select the combination of pre-loading and run-time loading that (s)he prefers. The ExecutionUnit exists until explicitly deleted. In this way, further class file loads, data file loads and execution actions can be performed. A simplified scheme of the activities involved in executing a program is depicted in figure 2. A user invokes execution of a program through the modified java command SUMA_java, which starts a Client. This Client constructs an ExecutionUnit and passes it to the SUMA_Engine. Then, the SUMA_Engine gets a server from the scheduler component (not shown in the figure) and contacts the ApplicationServer on this server, which initiates the program execution. The ApplicationServer communicates with the Client to dynamically load classes and, depending on the execution mechanism used, to read/write files on the client. Note that only some of the suma components are shown in the figure.

[Diagram: (1) the Client loads the ExecutionUnit into the SUMA_Engine; (2) the SUMA_Engine allocates the ExecutionUnit to an ApplicationServer; (3) the ApplicationServer dynamically accesses classes and data on the Client.]

Figure 2: Basic activities involved in executing a program in suma
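The flow of figure 2 can be sketched in miniature: pre-loaded classes travel with the request, and anything missing is fetched back from the client on demand. Every name below is hypothetical; the actual suma CORBA interfaces are not shown in this paper.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch of figure 2: the server resolves a class either from the
// pre-loaded set or by calling back to the client. Not the actual suma API.
class FlowSketch {
    static String run(Map<String, String> preloaded,
                      Function<String, String> clientFetch,
                      String mainClass) {
        // (1) Load ExecutionUnit: pre-loaded classes travel with the request.
        Map<String, String> serverCache = new HashMap<>(preloaded);
        // (2) Allocate ExecutionUnit: the engine would pick an ApplicationServer here.
        // (3) Dynamic access: missing classes are fetched from the client on demand.
        return serverCache.computeIfAbsent(mainClass, clientFetch);
    }
}
```

The sketch makes the trade-off of section 2.2 concrete: a fully pre-loaded unit never triggers the callback, while an empty one fetches everything at run-time.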

2.3 Execution Environment

Any Java class file that runs on a Java Virtual Machine also runs on suma. Additionally, a suma ApplicationServer always contains a number of classes that can be imported by any Java program executing in suma. Native classes contained in the Data objects must be explicitly declared and used. Sequential ApplicationServers contain wrappers for BLAS and Lapack, and will contain a larger set of native packages. Parallel ApplicationServers contain mpiJava and a wrapper for PLapack.
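As an illustration of the kind of routine such a BLAS/Lapack wrapper forwards to native code, here is a pure-Java stand-in for a dense matrix multiply. The class name and signature are assumptions for this sketch, not the actual suma wrapper interface.

```java
// Illustrative pure-Java stand-in for a BLAS-style matrix multiply; a real
// suma wrapper would forward this call to native Lapack/BLAS code instead.
class SumaBlasSketch {
    // C = A * B for row-major n x n matrices stored as flat arrays.
    static double[] multiply(double[] a, double[] b, int n) {
        double[] c = new double[n * n];
        for (int i = 0; i < n; i++)
            for (int k = 0; k < n; k++) {
                double aik = a[i * n + k];
                for (int j = 0; j < n; j++)
                    c[i * n + j] += aik * b[k * n + j];
            }
        return c;
    }
}
```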

3 Interaction Models

The objects defined in suma are accessed through different interaction models, which we call views. These views are abstractions built on top of visible suma objects. The views are designed according to the user type or role. A non-expert user will see a single virtual machine, while the administrator view gives the necessary abstractions to configure suma, from user registration to installation of new services. Other views are also possible, for instance, a view that allows an expert user to do more specialized performance optimizations. The views are supported by a single underlying object model. In this paper, we focus our attention on the non-expert user view. While suma will expose a greater number of details (mainly for tuning purposes) to an expert user, we are more interested in providing the right abstraction through the non-expert user view. An authorized user can open a session in suma from a workstation by creating an ExecutionUnit. The client machine becomes part of suma while the session is open. There are three main ways of interacting with suma. First, the user can open a suma session from a program executing on the local workstation. The program can then invoke any of the suma native services listed in the library catalog, which includes commodity scientific code. Second, the user can supply Java bytecode for execution in suma. The SUMA_Scheduler component chooses the server on which the code is actually executed. The Java bytecode can include calls to the mpiJava package. The third option is a combination of the first two: the user can supply Java bytecode that invokes suma native services. The suma object model accommodates a number of basic mechanisms needed to support the aforementioned views.

4 Experiences with SUMA

We designed some experiments to evaluate the alternatives related to pre-loading and dynamic loading of data structures. We built a small scale testbed to simulate a suma environment. We used a prototype of suma built on top of JacORB [11], a freely available CORBA implementation. The hardware on which the experiments were conducted consists of several Sun Ultrasparc 1 workstations and a Sparc Classic connected with Ethernet. The Ultrasparcs played the role of high performance resources (servers) and the Sparc Classic simulated a low performance desktop client machine. The SUMA_Engine runs on one of the Ultrasparcs, and one ApplicationServer runs on each of the other Ultrasparcs. The SUMA_java client runs on the Sparc Classic.

[Plot of execution time (seconds) against matrix dimension (0 to 100) for the three cases compared below.]

Figure 3: Matrix multiplication execution time on SUMA

The experiment consists of running a matrix multiplication Java program on suma, varying the matrix size up to 100x100. We want to compare the following three cases:

Local: Running the program entirely locally (on the client). This situation stands for a machine outside suma.

Matrices in Data: The ExecutionUnit contains both input matrices (Data) and a reference to the main class (Code).

Matrices in local files: Same as before, but instead of passing the matrices along in the ExecutionUnit, only references to the files at the client machine are given.

The results are shown in figure 3. As expected, the suma versions, running on a more powerful machine, soon outperform the local version of the matrix multiplication program. The version that loads the matrices in the ExecutionUnit is faster than the version which loads the matrices at run-time, on demand. The latter version implies a remote access for every matrix element access. The communication overhead explains the large difference (a factor of about two) with respect to the "Matrices in Data" version, which loads both matrices on the remote node in a single communication operation. Even though the total number of transferred bytes is the same in both cases, pre-loading is much more efficient because it is done in a single communication operation. As the dynamic load version is necessary for cases in which a data structure is large and the program only needs a small portion of it, we plan to add caching in order to improve dynamic remote file accesses.
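The gap is consistent with a simple count of communication operations. The sketch below assumes a naive n x n multiply in which every element read of a remote matrix costs one round trip; it counts operations only, not time, so it overstates the measured factor-of-two difference, which also includes per-message latency and computation.

```java
// Back-of-the-envelope count of communication operations for the two loading
// schemes, assuming one round trip per on-demand remote element access.
class CommCostSketch {
    // Pre-loading: both n x n input matrices travel in a single operation.
    static long preloadOps(int n) { return 1; }

    // On-demand: a naive n x n multiply performs n^3 element reads on each
    // of the two input matrices, i.e. 2 * n^3 remote accesses in total.
    static long onDemandOps(int n) { return 2L * n * n * n; }
}
```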

5 Conclusions and Future Work

In this work, we address the development of a metacomputing system for the execution of Java bytecode, through the definition of an object oriented interface and the efficient implementation of metacomputing services, for scientific and engineering application development. Tuning of the clients and the middleware, as well as the design of other views of the metasystem, like the administrator view, are goals of future work in this project. More details on the suma project can be found at http://suma.ldc.usb.ve.

Acknowledgments

We thank Alana Aguirre, Luis Berbín, Roberto Bouza, Pedro García, Héctor Rodríguez, and David Torres, who collaborated in the prototype implementation and the experiments.

References

[1] Java Grande Forum Report: Making Java work for high-end computing. Technical Report JGF-TR-1, Java Grande Forum Panel, 1998.

[2] Jose Moreira. Closing the performance gap between Java and Fortran in technical computing. In Java for High Performance Computing Workshop, Europar 98, 1998.

[3] V. Getov, S. Flynn-Hummel, and S. Mintchev. High-performance parallel programming in Java: Exploiting native libraries. In ACM 1998 Workshop on Java for High-Performance Network Computing, 1998.

[4] Mark Baker, Bryan Carpenter, Sung Hoon Ko, and Xinying Li. mpiJava: A Java interface to MPI. In First UK Workshop on Java for High Performance Network Computing, Europar 98, 1998.

[5] Mark Baker and Geoffrey Fox. Metacomputing: Harnessing informal supercomputers. In Rajkumar Buyya, editor, High Performance Cluster Computing: Architectures and Systems, volume 1, pages 154-186. Prentice Hall PTR, 1999.

[6] Gregor von Laszewski. Desktop Access to Remote Resources. http://www-fp.mcs.anl.gov/~gregor/datorr.

[7] I. Foster and C. Kesselman. Globus: A metacomputing infrastructure toolkit. The International Journal of Supercomputer Applications and High Performance Computing, 11(2):115-128, Summer 1997.

[8] A. S. Grimshaw, A. Nguyen-Tuong, M. J. Lewis, and M. Hyett. Campus-wide computing: Early results using Legion at the University of Virginia. The International Journal of Supercomputer Applications and High Performance Computing, 11(2):129-143, Summer 1997.

[9] http://www.netlib.org/lapack.

[10] Philip Alpatov, Greg Baker, Carter Edwards, John Gunnels, Greg Morrow, James Overfelt, Robert van de Geijn, and Yuan-Jye J. Wu. PLAPACK: Parallel linear algebra package. In Proceedings of the SIAM Parallel Processing Conference, 1997.

[11] Gerald Brose. JacORB: A Java Object Request Broker. Technical Report B 97-2, Institut für Informatik, Freie Universität Berlin, 1997.
