Nomadic Metacomputing

Patrick H. Fry

Department of Computer Science, Rensselaer Polytechnic Institute, Troy, NY 12180-3590, USA [email protected]

Ph.D. Research Proposal
Prepared for the Examining Committee

Dr. Boleslaw K. Szymanski, Rensselaer (Chair)
Dr. Mark K. Goldberg, Rensselaer
Dr. Shivkumar Kalyanaraman, Rensselaer
Dr. Joshua W. Knight, IBM Research
Dr. Mukkai S. Krishnamoorthy, Rensselaer

Contents

1 Introduction and Overview
  1.1 Motivation of Research
    1.1.1 Metasystem Research Issues
    1.1.2 Mobile Computing
2 Previous Work
  2.1 Legion
    2.1.1 Programming Model
    2.1.2 System Architecture
  2.2 Globus
    2.2.1 Programming Model
    2.2.2 System Architecture
  2.3 Charlotte
    2.3.1 Programming Model
    2.3.2 System Architecture
  2.4 Javelin
    2.4.1 System Architecture
  2.5 Other Metasystems
  2.6 D'Agent
    2.6.1 Programming Model
    2.6.2 System Architecture
  2.7 Discussion of Previous Work
3 Research Objectives and Approach
  3.1 Programming Language and Execution Environment
  3.2 Distributed Hierarchy
  3.3 Mobile Clients
    3.3.1 Docking
    3.3.2 Identification
    3.3.3 Temporary storage
    3.3.4 Delayed response
4 Current Work and Preliminary Results
  4.1 Twin Primes Distribution
    4.1.1 Hierarchical Resource Organization
    4.1.2 Reliability
    4.1.3 System Performance
  4.2 Caching DHCP Services
    4.2.1 Caching DHCP Relay Agent
    4.2.2 Prototype Implementation
5 Future Work and Conclusions
  5.1 Metasystem
    5.1.1 Scalability
    5.1.2 Load Balancing
  5.2 Mobility
    5.2.1 Client Relay
    5.2.2 Docking Station
  5.3 Applications
  5.4 Conclusion

List of Figures

2.1 Matrix multiplication in Charlotte [4].
2.2 The architecture of Agent Tcl [36]. (a) Core system four levels. (b) Support agents.
4.1 Potential primes and twins modulo 30. Multiples of 2, 3 and 5 are grayed out and potential twins are circled.
4.2 Communication paths between a farmer and its workers.
4.3 Segmentation of search space and overview of recoverability options.
4.4 Progress of twin primes computation (time in hours vs. number of intervals, 1 interval = 10^10).
4.5 A "typical" DHCP network.
4.6 Three stages of a DHCP client address lease.
4.7 Sample interaction between client, CDRA, and server.
4.8 DHCP client state diagram for DHCP protocol [20].
4.9 Client state diagram used by CDRA.
5.1 Static Client Communication.
5.2 Nomadic Client Communication.

List of Tables

2.1 Common problems facing metasystems and Legion's proposed solutions [30].
2.2 Core Globus services [24].

Chapter 1 Introduction and Overview

Up until about 10 years ago, high performance computing relied almost exclusively on supercomputers and massively parallel machines. However, the exponentially increasing processing power of relatively inexpensive PCs and workstations has made Networks of Workstations (NOWs) a feasible and highly economical platform for high performance parallel computing [2]. Libraries like PVM [43] and MPI [37] are now commonly used by the HPC community for cluster-based computing. Other NOW systems and libraries are described in [1, 9, 34, 38, 42, 49].

Another type of parallel computing that has recently become popular is web-based metacomputing. Metacomputing systems, also referred to as metasystems, are designed to use the idle CPU cycles of machines, with the Internet as the interconnection network. While cluster-based systems may have tens or hundreds of nodes, metasystems are designed to support hundreds, thousands, or even millions of computers. Cluster-based systems are used for parallel execution of applications on a singly administered network (e.g., an educational institution or corporation). Metasystems target long running applications of universal interest, like cryptography, mathematics and computational science [47].

There are already some interesting projects using computational resources over the Internet. Examples include Rensselaer's Twin Primes Project for analyzing twin primes distribution and Brun's Constant [28, 50], the distributed.net project for cracking RC5 and DES encryption keys [17], the GIMPS project for finding Mersenne Primes [29], and SETI@home for aiding in the Search for Extra-Terrestrial Intelligence (SETI) [46].

Like NOWs, metasystems are able to "ride the technology curve," quickly benefiting from technological developments in Commercial Off The Shelf (COTS) components. Metasystems can take advantage of the latest in hardware (CPU, I/O), software (OS, compilers) and networking technologies [33]. Since metasystems are primarily designed to use the idle CPU cycles of machines on the Internet, hardware is cheap (or sometimes free!) from the perspective of the application developer.

The increasing number and processing power of laptops and other portable processing devices are as yet an untapped resource for metasystems. Mobile computers are becoming increasingly popular because of the benefits they provide: users can keep their data and applications with them at all times and can obtain network and/or Internet access from nearly any location. However, mobile computers, by their very nature, do not have permanent network connections. They can be disconnected for long periods of time, and when they are connected, their network connection is often poor in terms of performance and reliability. In addition, a mobile computer might have a different name and IP address at each reconnect to the network. For example, at the user's home the mobile computer could be assigned a name and address by the user's ISP, and then a different address when the user connects to the network at his/her office at work.

We propose an investigation into the research issues involved with integrating mobile computers in a metasystem. There are two major benefits to such a merger. The first, obvious benefit is the addition of more processing power to a metasystem for faster computation. However, mobile computers need special treatment due to their unique network connectivity issues. The second, less obvious but potentially very powerful, benefit is the added capability of a mobile user to submit work to the metasystem at any time, regardless of network connection type or location.

In this proposal we discuss in detail the research issues involved. We also outline our plans for constructing a "nomadic metasystem," a fully featured metasystem enhanced to support mobile computers, and discuss what types of applications would benefit from such a system. After a brief overview of the basic structure and issues involved with metasystems and nomadic extensions to metasystems, we present a short survey of existing metasystems and discuss some of their advantages and shortcomings. This serves to motivate the high level description of the proposed research to explore new techniques for adding support for mobile systems to metacomputing. We then present some of our current work in the area of distributed systems, including a distributed twin primes application using hundreds of machines, and the addition of caching functionality to the intermediate nodes, known as relay agents, of the Dynamic Host Configuration Protocol (DHCP) [20] network services. We conclude by stating the direction and goals for our future work in this area.

1.1 Motivation of Research

The goal of this research is to explore new techniques by which mobile computers can be used in metacomputing. This involves exploiting technology from the disciplines of both metacomputing and mobile computing. Specifically, we are interested in effectively allowing mobile resources to be added to the computation pool of a metasystem, and in methods for mobile users to submit jobs to a metasystem transparently with respect to network connection type or location. For a system of this type to be useful, it must provide the application developer with a simple set of tools which hide the gory details of issues like the connection status of a mobile computer. In this section we present some of the research issues for metasystems and then discuss the research issues in nomadic computing which need to be considered when adding mobile devices to a metasystem. This serves as the foundation for the remainder of the proposal.

1.1.1 Metasystem Research Issues

Because of the high latency and low end-to-end throughput of the Internet, metasystems are best suited for parallel applications in which each subcomputation needs only a limited or constant volume of data to initiate and/or synchronize execution and reports a constant amount of data as a result. For example, the results might be binary values for searches, or a few numbers for factoring, enumerations and computation of moments of data distributions. Metasystems face many challenges not normally present in a parallel execution environment. Below, we list the most important of these challenges.

Scalability

Metasystems are usually designed to support hundreds, thousands, or even millions of machines. These machines are interconnected using the Internet, which has high latency and low end-to-end throughput. In this type of environment, a central management server would be too much of a bottleneck. A hierarchical, geographically dispersed collection of servers is virtually required for efficient operation. However, management of these loosely connected servers is not an easy task.

Ease of Use

For metasystems, "ease of use" is important at multiple levels. First, it must be simple for "Joe User" to add his/her machine to the computation. End users who are donating CPU cycles will not do so if there are complex installation procedures or configurations for the client portion of the metasystem. Ease of use also applies to application development and submission. The metasystem should either support existing parallel languages or provide simple functions for basic operations, and should use object inheritance to hide the gory details of the implementation. The application developer should not have to worry about issues such as work replication, heterogeneous architectures or access control. An easy to use facility for submitting jobs to the system is also important. Finally, consider domain setup and administration. A system scalable to thousands or millions of nodes must have a collection of geographically dispersed servers, instead of just one, for performance and reliability. These servers will most likely need to be installed and configured by administrators of the domain in which they are being placed. Metasystem designers must provide sufficient documentation and installation help to facilitate this process.

Execution Time

Metasystems are used to execute programs which may take months or even years to finish. Such a long run requires frequent saving of state (checkpointing) to ensure minimum loss of work in the event of a system failure (e.g., power outage or server shutdown).
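As an illustration of the checkpointing requirement, the following is a minimal Java sketch, not the mechanism used by any system discussed in this proposal. The class name `Checkpoint`, the file name, and the idea of numbering work units with a `long` are assumptions made for the example. The new state is written to a temporary file and then renamed over the old checkpoint, so a failure while writing leaves the previous checkpoint in place.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;

/** Hypothetical helper (an assumption for this example): persists the next unprocessed work unit. */
class Checkpoint {
    private final Path file;

    Checkpoint(Path file) {
        this.file = file;
    }

    /** Write the new state to a temporary file, then rename it over the old checkpoint. */
    void save(long nextUnit) {
        Path tmp = file.resolveSibling(file.getFileName() + ".tmp");
        try {
            Files.write(tmp, Long.toString(nextUnit).getBytes(StandardCharsets.UTF_8));
            // A crash before this rename leaves the previous checkpoint untouched.
            Files.move(tmp, file, StandardCopyOption.REPLACE_EXISTING);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Return the saved position, or the given start value when no checkpoint exists. */
    long loadOr(long start) {
        try {
            if (!Files.exists(file)) {
                return start;
            }
            String text = new String(Files.readAllBytes(file), StandardCharsets.UTF_8).trim();
            return Long.parseLong(text);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Path file = Paths.get(System.getProperty("java.io.tmpdir"), "example-run.ckpt");
        Checkpoint ckpt = new Checkpoint(file);
        long next = ckpt.loadOr(0L);           // resume where the last run stopped
        for (long unit = next; unit < next + 3; unit++) {
            // ... process one work unit here ...
            ckpt.save(unit + 1);               // record progress after each unit
        }
        System.out.println("next unit to process: " + ckpt.loadOr(0L));
    }
}
```

Saving after every unit bounds the lost work to one unit; a real application would choose the checkpoint interval by weighing I/O cost against the cost of recomputation.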

Adaptive Parallelism

The set of participating machines will grow and shrink during the computation. The metasystem must be able to gather information about its current state dynamically and adapt its behavior appropriately at any time during execution, not just at its startup phase. The system must be able to identify a node failure and reassign the work to another participant.
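One common way to detect a failed participant and reassign its work is to hand out work units under a time-limited lease. The sketch below is a hedged illustration of that idea, not the design proposed here; the class `LeasedWorkPool`, the integer unit numbering, and the caller-supplied clock value are all assumptions for the example.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Queue;

/** Hypothetical work pool: units leased to workers are reissued if the lease expires. */
class LeasedWorkPool {
    private final Queue<Integer> pending = new ArrayDeque<>();
    private final Map<Integer, Long> leaseExpiry = new HashMap<>(); // unit -> deadline in ms
    private final long leaseMillis;

    LeasedWorkPool(int units, long leaseMillis) {
        for (int u = 0; u < units; u++) {
            pending.add(u);
        }
        this.leaseMillis = leaseMillis;
    }

    /** Hand out the next unit, first reclaiming units whose worker went silent. Returns -1 if none is available. */
    synchronized int lease(long nowMillis) {
        reclaimExpired(nowMillis);
        Integer unit = pending.poll();
        if (unit == null) {
            return -1;
        }
        leaseExpiry.put(unit, nowMillis + leaseMillis);
        return unit;
    }

    /** Record a finished unit; a late result for a requeued unit is still accepted. */
    synchronized boolean complete(int unit) {
        boolean wasLeased = leaseExpiry.remove(unit) != null;
        boolean wasRequeued = pending.remove(Integer.valueOf(unit));
        return wasLeased || wasRequeued;
    }

    synchronized boolean allDone() {
        return pending.isEmpty() && leaseExpiry.isEmpty();
    }

    /** Move every unit whose deadline has passed back onto the pending queue. */
    private void reclaimExpired(long nowMillis) {
        List<Integer> expired = new ArrayList<>();
        for (Map.Entry<Integer, Long> e : leaseExpiry.entrySet()) {
            if (e.getValue() <= nowMillis) {
                expired.add(e.getKey());
            }
        }
        for (Integer unit : expired) {
            leaseExpiry.remove(unit);
            pending.add(unit);
        }
    }

    public static void main(String[] args) {
        LeasedWorkPool pool = new LeasedWorkPool(2, 100);
        int first = pool.lease(0);
        int second = pool.lease(0);
        pool.complete(first);                  // the first worker reports back in time
        int reissued = pool.lease(200);        // the second worker never did, so its unit is reissued
        System.out.println("reissued unit: " + reissued + " (originally leased as " + second + ")");
    }
}
```

Because the set of participants changes during the run, the pool never tracks workers explicitly; a worker that leaves simply lets its lease lapse.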

Heterogeneity

In order to get as many participating nodes as possible, most metasystems have been designed to run on multiple architectures. For this reason, many metasystems are Java-based.

Security

Unlike cluster-based computing, there is no single system administrator to manage all participating nodes in a metasystem. Users of the participating machines need guarantees that the metacomputing system will not make unauthorized changes or accesses to their machines (e.g., file and resource control). Issues include authorization, authentication, and encryption. Running the metasystem client as a Java applet in a web browser helps alleviate some of these concerns.

Incentives for Participation

Getting end users to participate can be a difficult task. Currently, most popular Internet-based distributed computing projects provide a Web page with statistics broken down by individual computers, groups, and architectures [48]. Another method is to offer monetary awards for key contributions. For example, the owner of the computer which successfully breaks an encrypted message, or the largest contributor, could be given a cash award.

1.1.2 Mobile Computing

Mobile computers provide some interesting capabilities to metasystems. In addition to the added CPU power, mobile computers enable users to interact with a metasystem (e.g., to add, delete, modify, or monitor jobs) from any geographic location. There are some issues unique to mobile computers when considering them for participation in a metasystem.

Roaming and Identification

The same mobile computer may often be connected to the Internet through several different mechanisms, including cellular or land-line modems, cable or satellite connections, and Ethernet or other high speed digital technologies such as T1 or T3. For each type of connection the mobile computer might have a different host name, IP address and even domain name (e.g., ISP access from home and company access from the office at work). This presents unusual requirements for providing unique and meaningful identifiers for each participating machine.
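One way to obtain an identifier that does not change with the host name or IP address is to generate a random identifier once and store it on the client. The following Java sketch illustrates this under stated assumptions: the class `ClientIdentity` and the identifier file are hypothetical, and a random UUID is only one possible choice of identifier (it is unique but not, by itself, "meaningful" to a human).

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.UUID;

/** Hypothetical helper: a client identifier that survives changes of host name and IP address. */
class ClientIdentity {
    /** Load the identifier stored at idFile, creating and storing a random one on first use. */
    static String loadOrCreate(Path idFile) {
        try {
            if (Files.exists(idFile)) {
                return new String(Files.readAllBytes(idFile), StandardCharsets.UTF_8).trim();
            }
            String id = UUID.randomUUID().toString();
            Files.write(idFile, id.getBytes(StandardCharsets.UTF_8));
            return id;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Path idFile = Paths.get(System.getProperty("java.io.tmpdir"), "example-client.id");
        // The same value is printed on every run, whatever network the machine is attached to.
        System.out.println("client id: " + loadOrCreate(idFile));
    }
}
```

The identifier depends on local storage, so this approach would not work unchanged inside an applet, which has no access to the local file system.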

Network Reliability

Different methods of network access provide different levels of reliability. A cable modem is generally more reliable than analog phone modem access, which is usually more reliable than a cellular modem. All of these methods are less reliable than an Ethernet LAN. The metasystem must be robust enough to survive connection failures that occur between, or even during, message exchanges.

Disconnected Mode

Mobile computers often run isolated from any network. Even in this "off-line" or disconnected mode of operation, the mobile computer could be providing services to the metasystem. For example, the computer could be processing work and saving the results to be returned to the metasystem server upon reconnection to the network. A mobile user could also submit a job to the system while disconnected. The submission would remain in a queue until a connection is established, at which point the job is automatically transferred to the system. It is important that disconnected mode operation be as useful as, and as similar as possible to, operation when connected to the system.
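The queueing behavior described above can be sketched as a store-and-forward outbox. This is an illustrative Java sketch only; the class `Outbox`, the string message format, and the `Predicate` used to stand in for the network send are assumptions, not part of the proposed system.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.function.Predicate;

/** Hypothetical store-and-forward outbox for a mobile client. */
class Outbox {
    private final Deque<String> queued = new ArrayDeque<>();

    /** Queue a message (a result or a job submission) regardless of connectivity. */
    void submit(String message) {
        queued.addLast(message);
    }

    /**
     * Try to deliver queued messages in order. The sender returns false when the
     * link is down; the undelivered message then stays at the head of the queue.
     * Returns the number of messages delivered by this call.
     */
    int flush(Predicate<String> sender) {
        int sent = 0;
        while (!queued.isEmpty() && sender.test(queued.peekFirst())) {
            queued.removeFirst();
            sent++;
        }
        return sent;
    }

    int size() {
        return queued.size();
    }

    public static void main(String[] args) {
        Outbox outbox = new Outbox();
        outbox.submit("result: unit 17 finished");
        outbox.submit("job: new search interval");

        outbox.flush(message -> false);        // disconnected: nothing is delivered
        System.out.println("queued while off-line: " + outbox.size());

        List<String> server = new ArrayList<>();
        outbox.flush(server::add);             // reconnected: messages are delivered in order
        System.out.println("delivered after reconnect: " + server);
    }
}
```

The application calls `submit` in the same way whether or not a connection exists, which is one way to keep disconnected operation similar to connected operation.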

Chapter 2 Previous Work

There are currently many metacomputing environments being developed. However, we know of no system which supports both parallel applications and the use of mobile resources. In this chapter we present four metasystems: Legion [30], Globus [24], Charlotte [4], and Javelin [11]. We also discuss D'Agent, a mobile agent system [36]. D'Agent provides a management environment for mobile resources using mobile agents. We are interested in mobile agent systems like D'Agent for the purpose of integrating mobile resources in metasystems.

At this point there are two main types of metasystems being developed. The first, at least historically, are those that rely on extensions to existing parallel programming libraries and cluster management systems like PVM [43], MPI [37], CC++ [10], LoadLeveler, and Condor [13]. Legion and Globus fall in this category. One major difficulty for this type of system is maintaining current and correct binaries of the application codes. Each supported computer architecture needs its own set of binaries for each parallel application (and for the underlying metasystem). Access control is another issue. Most of these systems require user access to each participating machine.

Charlotte and Javelin fall in the second category: Java-based metasystems. Up until recently, Java Virtual Machine (JVM) performance was a serious hindrance for computation intensive applications. However, recent improvements in the execution environment for Java applets (e.g., just-in-time (JIT) and dynamic compilation) allow for more efficient execution. The Java applet environment inside a web browser provides some "no extra cost" features which are extremely useful for metasystems, like automatic source dissemination, strict security, and support from all modern mainstream operating systems.

There are some restrictions placed on Java applets which limit their usefulness as an environment for metasystems. Consider network communication. A Java applet, by default, is only allowed to communicate with the HTTP server from which it was downloaded. Direct peer-to-peer communication is not possible. Communication between peers must instead be passed through the server, which can become a serious performance bottleneck. One option is to run the metasystem task as a Java application instead of as an applet, but then some other useful security restrictions, such as the restriction on local file access, are lifted.

Section 2.5 describes some other metasystems which may be of interest. All are similar to one or more of the four systems described below. These are not discussed in detail, but short descriptions and references are provided for interested readers.

Problem | Tools Available
Writing parallel applications | PVM, parallel C++, Fortran wrappers
Multiple separate file systems | Federated file system for transparent file access
Heterogeneous resources | Automatic scheduling, binary selection and migration, application specific scheduling tools
Multiple resource owners | Owner control of resource consumption, detailed resource consumption accounting
Debugging parallel programs difficult | Post-mortem playback using off-the-shelf debuggers
Host/network failures | Automatic system reconfiguration and limited application fault tolerance

Table 2.1: Common problems facing metasystems and Legion's proposed solutions [30].

2.1 Legion

Legion [30], begun in 1993, is a metasystem software project at the University of Virginia. Legion is an ambitious project whose goal is nothing less than creating a "worldwide virtual computer" providing location transparent access to resources available in the system. Legion is still in the early stages of development. The designers of Legion have the following objectives:

Site autonomy: Legion will consist of resources spread across spheres of system control, each system controlling the use of its resources by the Legion system (e.g., how often and when a resource is available to Legion and what kind of access is allowed).
Extensible core: The core system must allow users to "construct their own mechanisms and policies to meet specific needs."
Scalable architecture: Must be scalable to millions of hosts, with no centralized structures and completely distributed management.
Easy to use, seamless computational environment: Compilers and run-time facilities should manage the environment for the user as much as possible.
High performance via parallelism: Support for easy to use task and data parallelism.
Single, persistent name space: A single name space for file and data access across the entire system.
Security for users and resource owners: Provide mechanisms for users to manage their own security needs and do not weaken existing security mechanisms in the host operating system.
Management and exploitation of resource heterogeneity: Legion must support interoperability between heterogeneous hardware and software components. Scheduling decisions and policies should be customized to each type of system.
Multiple language support and interoperability: Support for legacy codes and heterogeneous language application components.
Fault tolerance: In such a large system, it is to be expected that several participating hosts will be temporarily down or disconnected at any time. Legion must dynamically reconfigure itself in the presence of these failures with minimal loss of work.

The first prototype of Legion, the Campus Wide Virtual Computer (CWVC), was released in 1995. It consists of more than 100 workstations and an 18-processor IBM SP2. No other metacomputing project is as ambitious as Legion. The CWVC at present does not realize all the objectives for Legion. This prototype has tools for integrating parallel applications written in PVM, parallel C++, and FORTRAN wrappers. At present, the core Legion object model, incorporating mechanisms for security, fault tolerance, application-directed scheduling, autonomy, and scalable binding, is being built. These services are not provided in the initial prototype. It may be some time before Legion becomes a reality, but the design and portions of the system may be useful for other metacomputing projects. Table 2.1 lists some common issues facing metasystem designs and how the designers of Legion propose to address them.

2.1.1 Programming Model

Legion is being designed to provide support for existing parallel programming languages like PVM and CC++. The goal of Legion's authors is to build the underlying metasystem so it is easy to "plug in" existing parallel languages by writing libraries which translate parallel procedures in the parallel language into calls to the Legion core system. As new parallel languages are created, they can be supported by Legion with limited effort. System support for "off-the-shelf" debuggers like dbx is also planned. Legion will have a single name space for resources, including application objects, files, and hardware resources. This single naming scheme will make it easier for programmers to identify and locate these items.

2.1.2 System Architecture

Legion is still in the early stages of development, so there are still many architecture issues to work out. Legion will not require "root" access to participating systems. However, some form of user access is required for the system daemons to run under. To support heterogeneous resources, Legion's system philosophy is to build a translation layer which allows Legion built applications to access resources in a uniform manner. For example, Legion has a federated file system. Legion applications requesting access to files will send the request to a Legion server (which may be a daemon running on the same machine as the caller), which will translate this call into a native call for the underlying file system in which the file is stored. Legion also provides automatic binary selection. For support of heterogeneous resources, Legion must be provided with an application binary for each system architecture under which the application will run.

Service | Name | Description
Resource Management | GRAM | Resource allocation and process management
Communication | Nexus | Unicast and multicast communication services
Security | GSI | Authentication and related security services
Information | MDS | Distributed access to structure and state information
Health and status | HBM | Monitoring of health and status of system components
Remote data access | GASS | Remote access to data via sequential and parallel interfaces
Executable management | GEM | Construction, caching, and location of executables

Table 2.2: Core Globus services [24].

2.2 Globus

The Globus Project [24] is a research program for developing "large-scale high performance distributed computing environments, or computational grids that provide dependable, consistent, and pervasive access to high-end computational resources." The Globus authors are interested in creating middleware to support dynamic resource allocation, heterogeneous and dynamic computation environments, and process level security. The primary goal of the Globus project is to combine clusters of high performance systems, such as the supercomputers at NCSA with those at Argonne National Laboratory. For this purpose a grid prototype, GUSTO (Globus Ubiquitous Supercomputing Testbed), has been created, for which Globus provides low level services such as a communication grid, resource identification, and authentication. The middleware portion of Globus is called the Globus Metacomputing Toolkit, which provides services for security, resource management, and communication. Table 2.2 lists the core services provided. Globus provides translucent interfaces to the heterogeneous resources. In other words, the interfaces manage the resources in a way which provides detailed information about the resource to the applications if desired.

2.2.1 Programming Model

Like Legion, Globus is designed to support existing parallel programming languages. The following are currently listed as supported or under development: MPI, CC++, HPC++, PAWS, CORBA, and RPC. For MPI, a grid-enabled MPI was created which uses Nexus for communication, GRAM for resource allocation, and GSI for authentication. Users can write standard MPI programs which function in a variety of parallel environments, like clustered heterogeneous workstations, MPP machines, or a mix of both.

2.2.2 System Architecture

As with the programming model, Globus "grid enables" existing system architectures. For resource management, GRAM provides an interface between Globus and local resource managers, among which the following are currently supported: Network Queuing Environment (NQE), EASY-LL, LSF, LoadLeveler, Condor [13], and a "fork" daemon.

For security, the Globus toolkit currently focuses only on authentication, with the thought that the other security issues, such as authorization and privacy, will eventually be built on top of the authentication layer. This is an especially difficult issue because the system must support N-way security, or any-to-any security, in contrast with traditional client-server application authentication.

2.3 Charlotte

Charlotte [4] is a metasystem based solely on Java without any native code. Clients are run entirely as Java applets within a web browser.
- A user can execute a parallel program on a machine on which he/she does not have an account.
- Neither a shared file system nor a copy of the program on the local file system is required.
- Local hardware is protected from programs written by "strangers".
- Any machine on the Web can join or leave any running computation, thus enabling dynamic resource utilization.

The use of Java applets makes many of the difficulties present in Legion and Globus disappear. Portability, heterogeneity, and security are no longer a concern. However, Charlotte also suffers from the restrictions of the applet environment. Currently, Java applications run at least 40% slower than machine code applications written in C, C++ or FORTRAN. Communication is another issue. Direct client-to-client communication is not currently supported, so the communicating clients must use the server as a "pass through". This indirection can cause unacceptable delays for communication intensive parallel applications.

Another issue is idle usage management. Similar to a screen saver, it is desirable for the metasystem to use processor cycles when the machine is idle but to shut down immediately, not just suspend, when other activity ensues on the machine. This type of automatic resource control is not possible from within an applet.

Another issue is file access. The applet environment is not allowed access to the local file system. Hence, there is no possibility of storing temporaries in files or accessing data files, except through the server. An example where local file access would be useful is our twin primes application discussed in Section 4.1. We currently use a 3MB file containing all prime numbers up to 10^8. In an applet environment, each time the applet is started this file would have to be either downloaded from the web server or regenerated by the client.
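To make the "regenerated by the client" option concrete, the following Java sketch rebuilds a prime table in memory with a sieve of Eratosthenes. It is an illustration only: the class `PrimeTable` and its method names are assumptions, and the proposal does not state how the 3MB file is produced or encoded. The small twin-pair counter is included only to connect the table to the twin primes application.

```java
import java.util.BitSet;

/** Hypothetical helper: regenerates a table of primes instead of downloading it. */
class PrimeTable {
    /** Sieve of Eratosthenes: after the call, bit i is set exactly when i >= 4 is composite. */
    static BitSet sieveComposites(int limit) {
        BitSet composite = new BitSet(Math.max(limit, 0) + 1);
        for (int p = 2; (long) p * p <= limit; p++) {
            if (!composite.get(p)) {
                for (long m = (long) p * p; m <= limit; m += p) {
                    composite.set((int) m);
                }
            }
        }
        return composite;
    }

    /** Count the primes in the range 2..limit. */
    static int countPrimes(int limit) {
        if (limit < 2) {
            return 0;
        }
        // The range 2..limit holds (limit - 1) numbers; those never marked composite are prime.
        return (limit - 1) - sieveComposites(limit).cardinality();
    }

    /** Count the pairs (p, p + 2) with both members prime and p + 2 <= limit. */
    static int countTwinPairs(int limit) {
        if (limit < 5) {
            return 0;
        }
        BitSet composite = sieveComposites(limit);
        int pairs = 0;
        for (int p = 3; p + 2 <= limit; p += 2) {
            if (!composite.get(p) && !composite.get(p + 2)) {
                pairs++;
            }
        }
        return pairs;
    }

    public static void main(String[] args) {
        System.out.println("primes up to 100: " + countPrimes(100));
        System.out.println("twin prime pairs up to 100: " + countTwinPairs(100));
    }
}
```

For a limit of 10^8 the `BitSet` occupies about 12.5MB of memory, so regeneration trades download time for client CPU time and memory at every applet start.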

2.3.1 Programming Model

Unlike Legion and Globus, Charlotte provides its own parallel programming model, consisting of a set of classes and interfaces using a distributed shared memory paradigm. A Charlotte program consists of alternating sequential and parallel steps. Sequential steps are written in standard sequential Java code. The computationally intensive parts are done in parallel steps, in one or more routines. A routine is analogous to a thread which can execute remotely. Parallel steps start and stop with the keywords parBegin and parEnd. Figure 2.1 shows a sample code snippet for matrix multiplication using a class which inherits from Charlotte's Droutine.
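For readers unfamiliar with this style, the alternation of sequential and parallel steps can be mimicked on a single machine with standard Java threads. The sketch below is only a local analogue and does not use Charlotte's API: there is no Droutine, parBegin, parEnd or distributed shared memory here, and the class `ParallelStepSketch` is an assumption for the example. Each row of the product matrix is computed by one task, and `invokeAll` plays the role of the barrier that ends the parallel step.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

/** Local, thread-based analogue of one parallel step (not Charlotte's API). */
class ParallelStepSketch {
    /** Multiply square matrices a and b, computing each row of the product as one task. */
    static double[][] multiply(double[][] a, double[][] b, int threads) {
        int n = a.length;
        double[][] c = new double[n][n];

        // Sequential step: describe the tasks ("routines") to run.
        List<Callable<Void>> routines = new ArrayList<>();
        for (int row = 0; row < n; row++) {
            final int i = row;
            routines.add(() -> {
                for (int j = 0; j < n; j++) {
                    double sum = 0;
                    for (int k = 0; k < n; k++) {
                        sum += a[i][k] * b[k][j];
                    }
                    c[i][j] = sum;
                }
                return null;
            });
        }

        // Parallel step: invokeAll returns only after every task has finished.
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            pool.invokeAll(routines);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
        return c;
    }

    public static void main(String[] args) {
        double[][] a = {{1, 2}, {3, 4}};
        double[][] b = {{5, 6}, {7, 8}};
        System.out.println(Arrays.deepToString(multiply(a, b, 2)));
    }
}
```

In Charlotte the routines may run on remote machines that join or leave during the step; the thread pool here is fixed and local, which is the main simplification.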

public class MatrixMult extends Droutine{ public static int Size = 500; public Dfloat a[][] = new Dfloat[Size][Size]; public Dfloat b[][] = new Dfloat[Size][Size]; public Dfloat c[][] = new Dfloat[Size][Size]; public MatrixMult() { ... } public void drun(int numTasks, int id) { int sum; for(int i=0; i