NORTHWESTERN UNIVERSITY

A Web-Based Distributed Programming Environment

A DISSERTATION SUBMITTED TO THE GRADUATE SCHOOL IN PARTIAL FULFILLMENT OF THE REQUIREMENTS for the degree DOCTOR OF PHILOSOPHY

Field of Electrical and Computer Engineering By

Kiyoko F. Aoki

EVANSTON, ILLINOIS December 1999

© Copyright by Kiyoko F. Aoki 1999
All Rights Reserved

ABSTRACT

A Web-Based Distributed Programming Environment

Kiyoko F. Aoki

A Java-based system, the GeoJAVA System, has been developed that allows a user to remotely compile his/her own C/C++ programs and execute them for visualization among a group of remote users. The advantage of this system is that users no longer need to maintain the large libraries with which their code must be compiled to develop geometric algorithms. These libraries are already provided by the GeoJAVA System. The user simply needs to access the system through the World Wide Web (WWW), upload their file(s), and execute the compiler. The GeoJAVA System is useful only when the executable resulting from the remote compilation is to be run either on the server or on a system of the same architecture as that of the server. Furthermore, the user can compile programs only with the libraries available on the system as they are. To make the GeoJAVA System more versatile, a "new and improved" system called DISPE, for DIStributed Programming Environment, has also been developed. Using Common Object Request Broker Architecture (CORBA) services, executables compiled with this system can invoke methods in libraries on remote sites in an architecturally heterogeneous environment. These libraries can be added or modified as long as they are registered in the system. Before being registered, libraries may be "CORBAized," which is the process of generating a CORBA wrapper for a library that allows executables to call its methods, even when it is located on a remote host. Remote debugging capabilities have also been incorporated so that users have access to a complete programming environment over the web. These features can all be used transparently to the user, with no modifications to the user's code.

To my parents, Takayuki and Kazuko Aoki

Acknowledgments

I would like to thank my advisor, Prof. D. T. Lee, for his continual encouragement and guidance during all my years at Northwestern University. I would not be at the position I am today without his support.

I would like to express my appreciation to the members of my final examination committee, Prof. Lawrence Henschen, Prof. Valerie Taylor, and Prof. C. H. Wu, for serving on my committee.

My deep appreciation goes to Dennis Sheu, Cindy Shen, et al. for introducing me to the GeoMAMOS project and guiding me throughout my undergraduate and graduate school years.

Many thanks also go to the people who assisted in the project, including Roger Lee, Mark Chin-Chuan Hsu, Mehmet Sayal, Lisa Singh, Takashi Yoshikawa, and Benjamin McLean.

Last, but not least, I would like to give my deepest thanks to my parents, Takayuki and Kazuko Aoki, for their belief in me, and for all the support they have given me. Words cannot describe the appreciation I have for them.

Contents

Abstract
Acknowledgments
1 Introduction
    1.1 The GeoJAVA System
    1.2 DISPE
2 Related Work
    2.1 Visualization Tools
    2.2 Collaboratories
    2.3 Parallel/Distributed C/C++ Compilers
    2.4 Agents
    2.5 CORBA
        2.5.1 CORBA Implementations
        2.5.2 CORBA-Based Applications
    2.6 Debuggers
3 Design Specification
    3.1 Agents
    3.2 Compiler
    3.3 The CORBAizer
    3.4 Debugger
4 System Architecture
    4.1 The GeoJAVA System Architecture
    4.2 DISPE Architecture
5 Implementation
    5.1 The GeoJAVA System
        5.1.1 MultiServer
        5.1.2 ChannelGuide
        5.1.3 GeoJAVASheet
        5.1.4 GeoLIB Geometric Library
        5.1.5 Chat Box
        5.1.6 Compilation Tool
    5.2 Compilation Agents
        5.2.1 libbfd
        5.2.2 IBM Aglets
        5.2.3 Java Native Interface (JNI)
        5.2.4 Agents and Compiler
        5.2.5 Registration Tool
    5.3 The CORBAizer
    5.4 Debugger
        5.4.1 Communication
        5.4.2 The GUI
6 Example
    6.1 Source Code
    6.2 "Traditional" Compilation
    6.3 Agent Compilation
    6.4 Debugging
    6.5 The CORBAizer
    6.6 Library Registration
7 Conclusion
    7.1 Summary
    7.2 Future Work
        7.2.1 GeoJAVASheet
        7.2.2 Agent Compilers
        7.2.3 The CORBAizer
        7.2.4 Debugger
        7.2.5 Registration Tool
    7.3 Concluding Remarks
Bibliography
A libbfd
    A.1 bfd
    A.2 asection
    A.3 asymbol
B CORBA Terminology
C Java Programming Language

List of Figures

1 GeoJAVA System architecture
2 DISPE architecture
3 ChannelGuide Applet
4 GeoJAVASheet Applet
5 GeoJAVASheet Application
6 Chat Box Applet
7 Login Screen
8 Code upload page
9 Successful Compilation
10 Unsuccessful Compilation
11 Program Execution Page
12 Algorithm Browser
13 Section Output
14 Successful Agent Compilation
15 Agents Architecture
16 Registration Tool
17 Adding a Group
18 Registered Information
19 Remote Debugger Architecture
20 Main GDB Window
21 User Input Window
22 Beginning of Compilation Agent Session
23 Selecting Interaction Level
24 Duplicate Include Files Found
25 Finding Include Files
26 Successful Compilation
27 GdbWeb Window
28 Watch Window
29 GdbWeb Window Help
30 File Dialog
31 New Library in Group 1
32 New Library Info
33 New Include Directory Entry

Chapter 1 Introduction

The World Wide Web (WWW) has become a network of little applications (applets), mobile agents, search engines, and online tools, in addition to a source of information. It has become the norm to deploy software on the WWW for more convenient access and to take advantage of system independence. One such application is the GeoJAVA System [4], a Java-based system that allows a user to remotely compile his/her own C/C++ programs and execute them for visualization among a group of users, each of whom is located at a remote site. This system provides the prototype for a programming environment on the WWW by freeing users from the need to download and/or maintain the large libraries with which they must usually compile their code in order to develop geometric algorithms. The user simply accesses the system through the WWW, uploads their file(s), and runs the compiler on the web server.

1.1 The GeoJAVA System

The GeoJAVA System is a web-based interactive visualization system that provides

• a Java-based GUI (graphical user interface) called GeoJAVASheet,

• a Java-based "chat" box for dialogue,

• a C/C++ library of geometric algorithms called GeoLIB,

• a compilation tool allowing users to implement user-defined algorithms using the GeoLIB library, and

• broadcast visualization of geometric algorithms.

Broadcast visualization can be applied to distance learning environments and collaborative research on geometric computing. Users can "automatically" run their own programs over the WWW for numerous users at remote sites to see, and those users can even interact with the program on the spot. The idea of remote compilation is also useful for users who cannot or do not want to install large libraries on their local machine just to compile a few pieces of code. All of this is done transparently to the user; the user's code need not be changed, since it is written as if it were to run on a local computer. Communication among users is enabled through a chat box.

1.2 DISPE

A limitation of the GeoJAVA System was that the user could not modify the existing libraries or compile with other libraries located elsewhere; the user could compile only with the libraries on the web server as they were. In addition, the executable resulting from the remote compilation could be run only on the server or on a system of the same architecture as that of the server, and debugging capabilities were not available. Therefore, the scope of the GeoJAVA System was expanded to resolve all of these issues and provide even more functionality. "Automated" distributed execution capabilities provided by the Common Object Request Broker Architecture (CORBA) [29, 17] were incorporated. That is, executables compiled with this system can be downloaded and still run in a distributed manner, transparently to the user, with no modifications to the user's code. This "new and improved" system is called DISPE, for DIStributed Programming Environment.

System administrators as well as programmers are familiar with the work involved in installing, compiling, and maintaining large libraries on a system for networked users. With various versions of these libraries, some of which are not backward-compatible, time must be spent setting up permissions, Makefiles, etc., despite the efforts of software and library vendors to make this process as painless as possible. DISPE provides users with a compilation environment that frees system administrators from the burden of maintaining libraries, and it gives programmers an easier alternative for compiling, debugging, and executing their code. The web-based interface can be accessed from any Java-enabled web browser after source files have been uploaded to the web server.

Compilation takes place when "agent compilers" take the source code and search for the header files and libraries with which to compile the source files. A compilation can be performed statically or dynamically. Agents performing a static compilation transfer the object files necessary for the compilation from remote libraries to the compiler. A dynamic compilation is more complicated due to the distributed nature of the system. A dynamic compilation using a shared or dynamic link library depends on the fact that both the executable and the library are located on the same host, so that when the program is executed, the library can be loaded into the program's address space and the program can use the definitions in the library. However, on a distributed system, it is (currently) not possible to load a library on a remote host into a local program's address space without specialized operating systems to handle such a procedure [19]. Therefore, in a dynamic compilation, the program is compiled to become one that uses CORBA when executed. This is transparent to the program because it is compiled with a "CORBA-ized" version of the original library. This new library is a wrapper for the shared library that allows users to still use the methods located in the library, even though it is on a remote host. The user need not be aware of this, however. In the dynamic case, when the program is executed, it runs as if it had been compiled locally, and all of the remote method invocations are handled internally.

A major advantage of this dynamic compilation feature is that, as with dynamic link libraries or shared libraries on a single machine, programs can transparently call methods located outside of the running process. Furthermore, when transparent upgrades are made to the shared library, the user program need not be recompiled with the new library in order to call methods from the new library, as long as the interface is unchanged.

Any library can be added to DISPE by going through a simple registration procedure that informs the agents of where the libraries and include files available for compilation are located. For shared libraries, a program called the CORBAizer generates the "CORBA-ized" version of the library. This new library is then registered and used during compilation of the user's source code. In this way, another layer of indirection has been incorporated into the compilation process without having to modify any existing libraries or source files, while the functionality of the user's original program has been augmented to run in a distributed environment. Specifics of the CORBAizer are given in Section 5.3.
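The wrapper idea can be illustrated with a minimal Java sketch, under loud assumptions. The names GeometryLibrary, LocalGeometryLibrary, RemoteGeometryStub, SimulatedServer, and WrapperDemo are all hypothetical, and the "remote" call is simulated inside one process with a plain string request rather than made through a CORBA ORB. The sketch is not the CORBAizer's output; it shows only why user code written against the library's interface can stay unchanged when the implementation is replaced by a forwarding wrapper.

```java
import java.util.function.Function;

// Hypothetical library interface that the user's code is written against.
interface GeometryLibrary {
    double triangleArea(double base, double height);
}

// The "original" library: computes the result in the caller's own process.
class LocalGeometryLibrary implements GeometryLibrary {
    public double triangleArea(double base, double height) {
        return 0.5 * base * height;
    }
}

// Stand-in for the wrapper: same interface, but every call is marshalled into
// a request string and handed to a transport. A real wrapper would go through
// a CORBA ORB; here the transport is just a function (an assumption made so
// that the sketch runs without a network).
class RemoteGeometryStub implements GeometryLibrary {
    private final Function<String, String> transport;

    RemoteGeometryStub(Function<String, String> transport) {
        this.transport = transport;
    }

    public double triangleArea(double base, double height) {
        String reply = transport.apply("triangleArea " + base + " " + height);
        return Double.parseDouble(reply);
    }
}

// Simulated remote host: unmarshals the request and calls the real library.
class SimulatedServer {
    private final GeometryLibrary target = new LocalGeometryLibrary();

    String handle(String request) {
        String[] parts = request.split(" ");
        if (!parts[0].equals("triangleArea")) {
            throw new IllegalArgumentException("unknown method: " + parts[0]);
        }
        double result = target.triangleArea(
                Double.parseDouble(parts[1]), Double.parseDouble(parts[2]));
        return Double.toString(result);
    }
}

class WrapperDemo {
    // "User code": identical no matter which implementation it is handed.
    static double userProgram(GeometryLibrary lib) {
        return lib.triangleArea(4.0, 3.0);
    }

    public static void main(String[] args) {
        SimulatedServer server = new SimulatedServer();
        System.out.println(userProgram(new LocalGeometryLibrary()));
        System.out.println(userProgram(new RemoteGeometryStub(server::handle)));
    }
}
```

Both calls in main go through the same userProgram method, which mirrors the claim that the indirection is added at the library boundary rather than in the user's source.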
Finally, DISPE extends the GeoJAVA System by providing remote debugging capabilities. Programs that have been compiled with the system via the web interface are available to the system's debugger. The user simply needs to access the web page where the "GDB Web" applet is located. A Java applet is used as the interface to the debugger, which runs on the web server. Details of the remote debugger are given in Section 5.4.

The next chapter discusses work related to DISPE. It is followed by an explanation of the goals behind the design of the system, a description of the system architecture of DISPE, the implementation details, an example of how to use the system, and an overview of future work with concluding remarks.

Chapter 2 Related Work

DISPE is a system that encompasses many fields. Since it is an extension of the GeoJAVA System, it incorporates a visualization tool with a collaboratory, allowing remote users to interact and solve geometric problems. It is also a distributed compilation system, different from existing compilers for high performance computing or parallel computers. It takes advantage of mobile agents to search for libraries during compilation. It uses CORBA for dynamic, distributed execution, and a web-based debugger provides remote debugging services. In this chapter, we discuss related work for each respective component of DISPE since, to our knowledge, no existing system provides all of the functionality that DISPE does.

2.1 Visualization Tools

GeomNet [53]

GeomNet at the Center for Geometric Computing, Johns Hopkins University [53] is a system for performing distributed geometric computing over the Internet. It provides a list of GeomNet-supported algorithms from which a single user can choose an algorithm to execute. Geometric computing is distributed in that the algorithms are available to anyone on the Internet who would like to see the execution of an algorithm. However, it is not implemented for groups of users to simultaneously see the execution of a single algorithm. One of the components of GeomNet is Mocha [5] at the Center for Geometric Computing at Brown University. Mocha is a single Java applet that communicates with an "algorithm server," which allows users to select geometric algorithms for which they can provide input.

VoroGlide [71]

VoroGlide by Christian Icking, Rolf Klein, Peter Kollner, and Lihong Ma [71] is an applet that smoothly maintains the convex hull, Voronoi diagram and Delaunay triangulation of the user's input while points are added or moved. It illustrates incremental construction of the Delaunay triangulation and includes a recorded demo.

ModeMap [63]

ModeMap by David Watson [63] is an applet that draws Voronoi diagrams, Delaunay triangulations, natural neighbor circles and radial density contours on a sphere. This is a single applet that illustrates the relationship between these geometric concepts on a sphere. It also allows for moving of points.

The Geometry Applet [52]

The Geometry Applet by David Joyce [52] illustrates Euclid's Elements. It lets users set up simple geometric objects in 3D as well as constraints through the use of Java parameters, and then displays the effects as objects are moved.

Alpha-shape Demo [44]

The Alpha-shape demo from NCSA (which requires VRML) [44] is an online Alvis demo that serves as a web-based interface to Alvis software. It is used to clarify concepts of Alpha Shapes and Alpha Ranks. Three data sets are available.

Although the applets listed above are effective in demonstrating various computational geometry algorithms, if a researcher, say, wanted to test and develop her own algorithm, she would not be able to make any practical use of these applets, let alone demonstrate the same execution of her algorithm simultaneously on remote parties' machines. This lack of interactivity and customizability motivated the development of the GeoJAVA system.

GeoMAMOS - Geometric MAnipulation and MOnitoring System [51]

The GeoJAVA system was based on GeoMAMOS [51], whose components include GeoSheet [22] and GeoManager [3], which provide distributed visualization of geometric algorithms over a UNIX-based network. GeoSheet is the 2D GUI for GeoMAMOS, serving as the interface with which users interact to communicate with their algorithms. GeoManager provides the dynamic re-execution of algorithms, which allows users to execute their program, then modify the original input data and immediately see the changes in the algorithm's output. A drawback of GeoMAMOS is that it was written in C/C++ for the X Windows environment on UNIX, so users who do not have access to such machines with the GeoMAMOS software installed are not able to make use of the visualization tool. In view of the above, a system-independent version was implemented in the form of the GeoJAVA system. Note that the GeoJAVA system has also been augmented with additional tools that make it more of a web-based programming environment as opposed to just a visualization tool.

2.2 Collaboratories

A number of collaboratories have come into development in the last few years. Some of these systems are also based on Java, including Tango [6], Promondia [14], NCSA's Habanero [55], and Digital Equipment Corporation's Java-based implementation of Collaborative Active Textbooks on algorithms, or JCAT [7].

Tango [6]

Tango is a Java-based system that allows remote users to collaborate over the Web. Users with applications that they would like to make distributed may incorporate Tango's Application Programming Interface (API) into their code, which allows their application to communicate with a central server that handles the "distribution" of the application. The focus of Tango is different from that of DISPE in that it is geared towards medical and scientific researchers. It is very useful in an environment where collaboration is needed among people with completely different specialties. For example, a consultation for a certain surgical procedure may require the expertise of a neurologist, a cardiovascular specialist, and a physical therapist, where all three need different views of the same data.

Promondia [14]

Promondia is a system that provides a framework for real-time group communication. Its focus is on group conferencing using a shared whiteboard, video, and chat system. Promondia's focus is also different from DISPE's in that it attempts to give users a foundation for real-time communication using Java, as opposed to having any distributed application-based purpose. Its aim is to satisfy the increasing demand for other network services, such as real-time data feeds, group communication, and teleconferencing.

Habanero [55]

Habanero is a framework for sharing Java objects with colleagues distributed over the Internet. It is similar to Tango in that single-user applications are transformed into multi-user, shared applications using a provided API, but it is applicable only to Java applications. Its components need to be downloaded, and only Java components can be used for collaboration.

Although all three of the above-named applications provide distributed collaboration, none of them provides all of the functionality that DISPE does. Remote compilation and remote debugging are not provided, let alone the execution of C/C++ programs that can access libraries on remote hosts during execution. In order to use the APIs provided, the user must download the libraries for the API and compile their code locally.

JCAT [7]

JCAT takes advantage of a new feature in Java version 1.1 called Remote Method Invocation (RMI), which allows applets on different machines to communicate with each other, so that the views of an algorithm can be located on different machines. The algorithms that are visualized are written in Java and are based on BALSA's notion of "interesting events" to communicate the operations of the algorithm to the remote machines [8]. "Group communication" is implemented by first having each "student" specify the name of the "teacher's" machine where the algorithm is running. Then the "teacher" demonstrates the algorithm while the visualization is broadcast to each remote machine.

The differences between JCAT and DISPE include location independence and remote compilation. While DISPE users may access the system through a single page and form a group using a single channel name, JCAT requires channels to be formed by forcing "students" to specify the hostname of the "teacher's" machine. This requires knowledge of the location of the "teacher." Also, only the teacher has control of the algorithm, and the students may not interact with the algorithm. Since no remote compilation is available, users are forced to use Java, to write their code specifically for distributed visualization, and to compile it locally, whereas users on DISPE can use any existing C/C++ code and need not be concerned with which parts of their code are distributed.

Again, because of the wide-ranging scope of DISPE, it is difficult to compare these systems with DISPE on a one-to-one level. But it is interesting to see the extent of the similarities between these systems.

2.3 Parallel/Distributed C/C++ Compilers

Much research in the area of developing efficient compilers for parallel machines is underway at many institutions and universities across the United States. Some projects are listed below.

HyperC [18]

Hyperparallel Technologies' HyperC is a data parallel programming language, constructed as an extension to the ANSI C language. HyperC programs can run on everything from small workstations to massively parallel computers. Thus, programmers can develop programs for parallel machines on their desktop workstations, then, for operational testing, use their intranet's network of workstations, and then move directly to a massively parallel computer.

SUIF [2]

The Stanford University Intermediate Format (SUIF) compiler system [2] is a compiler system for researching compiler techniques for high-performance architectures. The inputs to the compiler are sequential FORTRAN 77 or C programs, which are first translated into the SUIF compiler's intermediate representation. An optimized and parallelized SUIF program is then converted to C and compiled by the native compiler on the target machine.

Although DISPE does not make use of a distributed or parallel compiler per se, this project may readily lead to such a compiler in the future.

2.4 Agents

The term "agent" is used in two senses that should be distinguished: agents of artificial intelligence (AI), which are more like robots, and agents as described later in this chapter. Projects whose focus is closer to the former include the Software Agents group at MIT Media Lab [67] and Softbots at the University of Washington [13].

Software Agents Group - MIT Media Lab [67]

The MIT group's agents act more like servants to their "owners." The servants are knowledgeable about their owner's interests, preferences, and characteristics, and perform tasks autonomously and pro-actively, depending on this information about their owner. For example, an artificially intelligent email agent might know that people may have administrative assistants, that a particular user has an assistant named, say, George, that an assistant should know the boss's meeting schedule, and that a message containing the word "meeting" may contain scheduling information. With this knowledge, the agent would deduce that it should forward a copy of the message to the assistant [23].

Softbots - University of Washington [13]

Similarly, Softbots serve their owners over the Internet. The Softbot is a fully implemented AI agent that uses a UNIX shell and the World Wide Web to interact with a wide range of Internet resources. Its effectors include ftp, telnet, mail, and numerous file manipulation commands. Its sensors include Internet facilities such as archie, gopher, netfind, and many more. It is designed to incorporate new facilities into its repertoire as they become available.

Quite a few agent products are in development, as is evident on the Agent Society home page [43]. Many of these are agent systems that provide a framework for agent projects. The major systems among these include IBM's Aglets [21], ObjectSpace's Voyager [72], Mitsubishi's Concordia [40], Open Group's Mobile Objects and Agents Project (MOA) [26], and KYMA Software's KYMA-Atlantis [58].

Aglets [21]

Aglets are the agents provided by IBM's Aglets Software Development Kit (SDK), which is an environment for programming mobile Internet agents in Java. Aglets are Java objects that can move from one host on the Internet to another. That is, an aglet that executes on one host can suddenly halt execution, dispatch itself to a remote host, and resume execution there. When an aglet moves, it takes along its program code as well as its data. Since DISPE uses Aglets in its implementation, they are covered in more detail in Section 5.2.2.
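The halt, dispatch, and resume cycle can be sketched with standard Java object serialization, under stated assumptions. This sketch does not use the Aglets API: the classes SearchAgent and MobilityDemo, the host names, and the header file name are hypothetical, and the "network" is an in-memory byte array. It also moves only the agent's data; unlike an aglet, it does not transfer program code, since the class is assumed to be present on both sides.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical agent: its fields are the state that travels with it.
class SearchAgent implements Serializable {
    private static final long serialVersionUID = 1L;

    private final String wanted;                             // file being sought
    private final List<String> visited = new ArrayList<>();  // hosts seen so far
    private String foundOn;                                   // null until found

    SearchAgent(String wanted) {
        this.wanted = wanted;
    }

    // Work done on arrival at a host: record the visit and check its files.
    void runOn(String hostName, List<String> filesOnHost) {
        visited.add(hostName);
        if (foundOn == null && filesOnHost.contains(wanted)) {
            foundOn = hostName;
        }
    }

    List<String> visited() { return visited; }

    String foundOn() { return foundOn; }
}

class MobilityDemo {
    // Simulated dispatch: flatten the agent to bytes (as if sent over the
    // network) and rebuild a copy from them (as if on the receiving host).
    static SearchAgent dispatch(SearchAgent agent) {
        try {
            ByteArrayOutputStream wire = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(wire)) {
                out.writeObject(agent);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(wire.toByteArray()))) {
                return (SearchAgent) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("dispatch failed", e);
        }
    }

    public static void main(String[] args) {
        SearchAgent agent = new SearchAgent("geolib.h");
        agent.runOn("hostA", Arrays.asList("stdio.h"));
        agent = dispatch(agent);  // halt on hostA, continue as a copy on hostB
        agent.runOn("hostB", Arrays.asList("geolib.h", "math.h"));
        System.out.println(agent.visited() + " found on " + agent.foundOn());
    }
}
```

The copy returned by dispatch remembers that hostA was already visited, which is the property that lets a mobile agent carry the progress of a search from one host to the next.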

Voyager [72]

Voyager is a 100% Java agent-enhanced Object Request Broker (ORB) that combines mobile autonomous agents and remote method invocation with complete CORBA support. It comes with distributed services such as directory, persistence, and publish-subscribe; with the last of these, a Java event can be published on a specific topic to a distributed group of subscribers.

Concordia [40]

Concordia is a framework for development and management of mobile agent applications for accessing information on any platform supporting Java. With Concordia, applications process data at the data source, process data even if the user is disconnected from the network, access and deliver information across multiple networks (LANs, Intranets and Internet), use wire-line or wireless communication, and support multiple client devices, such as desktop computers, personal digital assistants (PDAs), notebook computers, and smart phones.

MOA [26]

MOA was designed to support migration, communication, and control of agents. It was implemented on top of the Java Virtual Machine, without any modifications to it. The initial project goals were to support communication across agent migration, as a means for collaborative work, and to provide extensive resource control, as a basic support for countering denial of service attacks. In addition, compliance with the Java Beans component model, which provides for additional configurability and customization of the agent system and agent applications, has been added, as well as interoperability, which allows cooperation with other agent systems.

KYMA-Atlantis [58]

KYMA-Atlantis is a platform for the development of intelligent agents and personal assistants. It allows developers to create colleagues that help users define tasks and rules. Further, KYMA-Atlantis runs on Windows 95 and NT and provides support for OLE 2.0, MAPI, TAPI, Intel's NSP technology, and ODBC. Utilizing a combination of rule-based inferencing, fuzzy logic, and neural network technology, KYMA-Atlantis is adept at solving problems that require multiple reasoning methods.

Some implementations of agents are also available, including AgentSoft's LiveAgent software [61], Autonomy's Knowledge Management Products [45], and Bunyip Information System's Uniform Resource Agents (URAs) [12], to name a few.

LiveAgent [61]

LiveAgent is AgentSoft's personal agent builder. Users create their own agents by going through a regular browsing session, and LiveAgent learns what they have done. These Java-based agents are trained to perform routine tasks on the Web. They generate user-defined browsing reports, and they can be further customized (parameterized browsing).

Autonomy [45]

Autonomy presents a range of Agentware software products for the automatic retrieval, organization, and distribution of knowledge and information. The knowledge management products consist of Agentware Knowledge Server and Agentware Knowledge Update. Knowledge Server uses advanced pattern recognition technology to automate categorization, cross referencing, hyperlinking and presentation of information; summarize documents and recommend related articles and documents via hypertext links; provide a visual interface for searching that presents a unified view of disparate data sources across the enterprise; and make each employee's knowledge base accessible to others. Knowledge Update eliminates the need for employees to surf the Internet and intranet. It monitors hundreds of specified Internet and intranet sites, newsfeeds and internal repositories containing Lotus Notes, HTML, word processing files, PDF files, and others; creates a personalized report informing individual employees of developments that are relevant to their specific jobs; and alerts the user via email, fax, pager, channel definition format, or push technology as soon as information of specific interest appears.

URA [12]

URAs are objects that are constructed to embody a particular set of Internet activities. To satisfy an information need using this technology, a client program would invoke an instance of a particular URA, and supply it with details to characterize the particular information need. For example, to conduct a search for music, the client might invoke the "Music URA", and specify which group or composer was of particular interest. The "Music URA" would be aware of the best music Web sites to visit and how to fill in their forms. It might also know about music-related mailing lists or news groups and filter their archives for information about the desired group.

One might note that all of these agents are focused on text processing, particularly over the World Wide Web, which is understandable considering the vast amounts of textual information made available over the Internet.

2.5 CORBA

The Object Management Group (OMG) developed the CORBA standard in response to the need for interoperability among the rapidly proliferating number of hardware and software products available. CORBA allows applications to communicate with one another no matter where they are located or who has designed them. CORBA 1.1 was introduced in 1991 by OMG and defined the Interface Definition Language (IDL) and the Application Programming Interfaces (API) that enable client/server object interaction within a specific implementation of an Object Request Broker (ORB). CORBA 2.0, adopted in December of 1994, defines true interoperability by specifying how different implementations of ORBs can interoperate. At the time of this writing, the current (formal) version of CORBA is version 2.2.

The ORB is the middleware that establishes the client-server relationships between objects. Using an ORB, a client can transparently invoke a method on a server object, which can be on the same machine or across a network. The ORB intercepts the call and is responsible for (1) finding an object that can implement the request, (2) passing the parameters for the request to the object, (3) invoking the method, and (4) returning the results. The client does not have to be aware of the object's location, its programming language, its operating system, or any other system aspects that are not part of the object's interface. In so doing, the ORB provides interoperability between applications on different machines in heterogeneous distributed environments and seamlessly interconnects multiple object systems[46]. Appendix B provides a glossary of terms specific to CORBA.
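The four responsibilities of the ORB listed above can be illustrated with a toy, in-process broker. This is only a sketch of the request flow and is not the CORBA API: the names ToyBroker, register, and invoke are hypothetical, and a real ORB would marshal the request across process and machine boundaries using stubs generated from IDL.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Toy, in-process stand-in for an ORB (hypothetical names; not the CORBA API).
class ToyBroker {
    // object name -> (method name -> implementation)
    private final Map<String, Map<String, Function<List<Object>, Object>>> objects =
            new HashMap<>();

    void register(String objectName, String method, Function<List<Object>, Object> impl) {
        objects.computeIfAbsent(objectName, k -> new HashMap<>()).put(method, impl);
    }

    Object invoke(String objectName, String method, List<Object> args) {
        // (1) find an object that can implement the request
        Map<String, Function<List<Object>, Object>> target = objects.get(objectName);
        if (target == null || !target.containsKey(method)) {
            throw new IllegalArgumentException(
                    "no object implements " + objectName + "." + method);
        }
        // (2) pass the parameters and (3) invoke the method
        Object result = target.get(method).apply(args);
        // (4) return the result to the caller
        return result;
    }
}

class ToyBrokerDemo {
    public static void main(String[] argv) {
        ToyBroker broker = new ToyBroker();
        // The caller only knows the object and method names, not where the code lives.
        broker.register("Geometry", "add",
                a -> ((Integer) a.get(0)) + ((Integer) a.get(1)));
        System.out.println(broker.invoke("Geometry", "add", Arrays.<Object>asList(2, 3)));
    }
}
```

The caller names an object and a method and never touches the implementation directly, which is the location transparency described above, here reduced to a map lookup.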

2.5.1 CORBA Implementations

Since the inception of CORBA, a number of different vendors have implemented their own versions of CORBA, each a little different from the others, especially where the CORBA specification was not very detailed. As is described in Chapter 3, the requirements for choosing a CORBA implementation for DISPE included adherence to version 2.2 of the CORBA specification, ease of use of the API, and availability. Based on these requirements, the following CORBA implementations were the most applicable.

The ACE ORB (TAO) [68]

TAO[68] is a real-time ORB end system designed to meet end-to-end application quality of service (QoS) requirements by vertically integrating CORBA middleware with operating system I/O subsystems, communication protocols, and network interfaces. It is written on top of the ADAPTIVE[42] Communication Environment (ACE)[41], which is an object-oriented network programming toolkit written in C++. This system was attractive in that it claimed to provide real-time services. However, because TAO was written on top of ACE, the application programming interface (API) was not straightforward, especially for users new to CORBA in general. For the current version of DISPE we therefore opted not to use TAO, but because of its completeness we may consider using it in the future.

ORBacus [64]

ORBacus[64] is a full-featured object request broker that is free for noncommercial use. Unfortunately, at the time of this writing, ORBacus version 3.1.1 is compliant only up to the CORBA 2.0 specification. Version 4.0 of ORBacus claims to support the CORBA 2.2 specification, but it was not available in time for our implementation of DISPE.

MICO [62]

MICO[62], which stands for "Mico Is COrba," is a CORBA implementation that is CORBA 2.2 compliant. The source code is freely available under the GNU General Public License. According to its documentation, the following design principles guided the implementation of MICO:

start from scratch: only use what standard UNIX API has to offer; don't rely on proprietary or specialized libraries.

use C++ for the implementation.

only make use of widely available, non-proprietary tools such as those provided by GNU and Cygnus.

omit bells and whistles: only implement what is required for a CORBA compliant implementation.

use a clear design even for implementation internals to ensure extensibility.

MICO has a clean API that supports the CORBA 2.2 specification, and it is freely available, so we chose it as the CORBA implementation for DISPE.

Others

There are several other popular vendors who provide CORBA implementations, such as Inprise's VisiBroker[73] and IONA's Orbix[56], but they were not considered because they either did not support CORBA 2.2 or were commercial products whose free demo versions, if any, were usable only for a limited time. A full list of CORBA implementations can be found in [65].

2.5.2 CORBA-Based Applications

There are several projects underway using CORBA, as can be seen in [49]. The ones that seem to be the most closely related to DISPE are the DOMIS Project at MITRE[50] and GOODE at the University of Lille[54].

DOMIS [50]

DOMIS, which stands for Distributed Object Management Integration System, is an Air Force-sponsored research project that is investigating the role of distributed object technology in building evolvable systems in the Air Force, especially those that require legacy software integration. For experimentation purposes, the project is using commercial object request brokers (ORBs) to integrate existing legacy systems.

GOODE [54]

GOODE, an acronym for the Generic Object-Oriented Dynamic Environment project, aims at building an open support environment for cooperative applications. Two tools that have been developed by this project group are CorbaWeb[48] and CorbaScript[47]. CorbaWeb is an environment providing a generic gateway to integrate CORBA and the WWW. It allows users to browse and invoke CORBA objects via web documents that are dynamically generated. CorbaScript is an interpreted object-oriented scripting language for CORBA environments. CorbaScript programs are scripts that can invoke any operation and get or set any attribute on CORBA objects. Any OMG IDL interface can be implemented by the scripts, and the language provides a dynamic binding to OMG IDL descriptions, which are extracted from the Interface Repository and made directly available to the scripts.

2.6 Debuggers

DISPE's remote debugger is a sequential program as opposed to a parallel one, mainly because the compiled programs are assumed to be sequential. However, because of the multiple mobile agents that will be dispatched to compile the programs, it may be possible to develop parallel programs based on this system and thus extend the debugger to run in parallel as well. Therefore, a multi-threaded or parallel debugger may become useful in the future. With this in mind, related work includes the following:

KDB [9]

KDB [9], or Kalli's DeBugger, is a multithreaded debugger for debugging multithreaded µC++[10] applications. µC++ is an extended version of C++ providing light-weight tasking facilities (i.e., user-level threads) using a shared-memory model. Both uniprocessor and multiprocessor µC++ applications may be debugged. KDB is based on parts of the gdb debugger[35].

GUARD [1]

GUARD[1] is the first implementation of parallel "relative" debugging technology. At the simplest level, GUARD is like many other debuggers that support parallel architectures, in that the developer can control the execution of each of the individual processes in a parallel execution environment. It allows the process identifiers associated with each process in a parallel application to be used to direct debugging commands to individual processes, or a group of processes. Output generated by each process is labelled so that it can be easily distinguished.

In addition, GUARD adds "relative" debugging, which is a technique that makes it possible to compare the execution of two programs with the same functionality. The programs can run on the same computer, or on different computers connected by a network. They can differ in their source programming language or in implementation details. One of the programs usually serves as a reference version which produces the desired results. The execution of the other program is compared to the reference version in order to find discrepancies. The result of the comparison is a list of differences in targeted data structures. GUARD allows these differences to be stored in Hierarchical Data Format (HDF), so that they may be imported into a visualization package.

MAD [16]

MAD[16], or the Monitoring And Debugging Environment at GUP Linz, Johannes Kepler University Linz, consists of a set of tools that help the user in detecting errors in parallel programs for distributed memory machines. More information can be found at http://zeus.gup.unilinz.ac.at/research/debugging/.

Chapter 3 Design Specification

Because we were building on top of the GeoJAVA System, we wanted to keep three major ideas in mind while implementing DISPE: portability, modularity, and scalability. The GeoJAVA System was already portable because all of its components were accessible via the web, so any machine connected to the web could access the system without concern for the machine's architecture. The components of the original system were also written in modules, as will be described in Chapter 5, and it was desirable for anything built on top of the GeoJAVA System to be scalable. Portability had to be supported especially for the CORBAizer, which needed to be executable on any system that was to make libraries available to DISPE. Chapter 4 will describe how the CORBAizer is a separate application from the rest of the system. Modularity was desired for easy maintenance, and scalability was needed for handling increasing numbers of users and agents. We also wanted to use a web interface for the system for easy access to the compilation agents and debugger. This chapter describes the specifications within which the components of DISPE were implemented on top of the GeoJAVA System.


3.1 Agents

According to [20], in which the differences between various mobile agent systems are analyzed, the widespread adoption of Java is what triggered the explosion of interest in mobile agent systems. Up until a few years ago, only a few mobile agent systems were being developed. But during the past couple of years, over a dozen Java-based agent systems have been announced and made available. Because of the support and development potential of Java, the mobile agent system in DISPE is also implemented in this popular programming language. Among the agent systems listed in Chapter 2.4, the one that was most appealing was IBM's Aglet Software Development Kit (ASDK). It provides a simple interface that is easy to learn, and since the aglet model reflects the applet model, programming aglets is easy to pick up. It also allows multiple agent servers (called Contexts) to run on a single machine, easing the debugging process in developing mobile agents.

3.2 Compiler

For the Java-based agents to call a C++ compiler, we had to make use of the Java Native Interface (JNI) [15] provided in JDK version 1.1. Although JDK version 1.2 has better support for JNI, we chose to use the previous version because IBM Aglets currently do not support the latest version. The original proposal for the compiler was to use Cygnus' GNU-Win32 compiler on Microsoft Windows NT because the GeoJAVA System was originally on Windows NT. However, we came across difficulties with support for JNI in the version of GNU-Win32 we were attempting to use, so the GeoJAVA System was ported to a Solaris machine. Under UNIX, we are currently using the GNU C++ compiler version 2.8.1. Cygnus was informed of the bug in its JNI support and has since made its latest version compatible with JNI, so our system is portable to 32-bit Windows architectures.

The portion of the compiler that needed to be modified for DISPE was the linker. In order for the agents to access the linker, it had to be compiled into a shared library, with communication between the (Java) agents and the (C) compiler performed via JNI.

There are two "forms" of compilation performed in a "traditional" environment: static and dynamic compilation. A static compilation copies all of the symbol definitions from libraries into the executable so that all of the code to be executed is packaged into the program. Dynamic compilations are a bit more complicated because the shared libraries used are normally expected to be located on the local machine, whereas in a distributed environment, the same libraries are not accessible to the executable when it is running. The solution to this issue implemented in DISPE is the CORBAizer, which is discussed later.

Also, in a static compilation, the linking of libraries residing on machines of different architectures was believed to be resolvable using the GNU linker, which uses the Binary File Descriptor (BFD) library, or libbfd[11]. However, libbfd did not fully support the needs of DISPE, and its limitations restricted our use of it. Therefore, the agents in the current system check the compatibility of a library before attempting to use it for the compilation, as opposed to compiling regardless of architecture in the hope that libbfd would be able to provide the appropriate translations of the libraries. For dynamic compilations, however, this does not present much of a problem, since the generated library need not be architecturally consistent with the original library as long as the CORBA interfaces between the library and the user program are compatible.


3.3 The CORBAizer

The implementation of the CORBAizer required parsing entire header files, generating data structures to organize the components of classes given in class definitions, distinguishing between public, protected, and private member functions and data members, etc. Fortunately, the "Class Exerciser" tool developed by Wang et al.[38] could be used for our purpose. This tool, which is based on GNU's C parser, allows users to create objects for classes from a given library, invoke methods on specified objects and examine the contents of the objects interactively. A portion of this tool is the Exerciser Generator, which takes header files as input and translates them into an intermediate format suitable for another portion of the tool that reacts to the user's commands for invoking methods and examining objects. This intermediate format is a library generated from the header files, so we found that this portion of the tool could be modified to generate a "CORBA-ized" version of the library based on the class declarations in the header files.

We had the original tool, written for Windows, ported to UNIX so that we could have both versions available. Then we built on top of this parser so that it could generate a skeleton of an existing library that served as an ORB interface for the original library. For the CORBA implementation, we used MICO because it was best suited to our requirements. Thus, our parser generates code that should be compiled with MICO's CORBA library. Implementation details are given in Chapter 5.3.

3.4 Debugger

In implementing the debugger, we wanted to be consistent in using the GNU tools, so gdb was used as the base tool. Originally, we planned on using the Tcl Plugin (Tclets)[69] because the GUI for Cygnus' gdb was written in Tcl. However, since the web server was transferred to UNIX and Cygnus removed the GUI for gdb in a recent version of their debugging tool, it was determined that Java would serve as a better language for implementing the web interface. Another advantage of using Java was that the security issues that cropped up in using Tclets disappeared.

Creating a Java applet for gdb meant that the interface would need to communicate with gdb running on the web server via a socket connection. Unlike MultiServer, however, gdb, being a legacy application, would only be able to handle one user at a time. So one instance of gdb would need to be instantiated for each user wishing to use the debugger. Therefore, CGI scripts were written to handle the instantiation of gdb and the establishment of a connection between the GUI applet and gdb.

The execution of the algorithm during the debugging process, on the other hand, can still be visualized among a group of remote users. In the configuration where one user instantiates a debugging process and the executable uses the GeoJAVASheet visualization tool, other users on the same channel will be able to watch the debugging process. More details of the implementation are given in Chapter 5.4.

Chapter 4 System Architecture

This chapter presents an overview of the architecture of DISPE. We begin with an introduction to the architecture of the system on which DISPE is based, and we follow this with a description of the current architecture.

4.1 The GeoJAVA System Architecture

The GeoJAVA System architecture is illustrated in Figure 1, which is a snapshot of the system when User 1 and User 2 are logged in and have entered the same channel. User 1 has the floor and is executing an algorithm. The dotted area containing MultiServer is the Java application that runs on the web server at a specified TCP port. When a new user enters, a new Connection object is created for the user's GeoJAVASheet, chat box, and possibly user program. The Connection objects are clients that MultiServer instantiates when a user enters the system, and these Connection objects communicate with their counterpart applets (i.e., GeoJAVASheet or chat box) through a TCP socket. The GeoJAVA System architecture is based on the organization of the AnnoyingChat/MultiServer applets provided by the Free Internet Conferencing Tools (FICT).

Figure 1: GeoJAVA System architecture. [Diagram: MultiServer containing the Floor Queue, Vulture, and Connection objects; each Connection is paired with a GeoJAVASheet, a Chat Box, or User 1's algorithm; the GeoJAVASheets of User 1 and User 2 both show User 1's algorithm display.]

The internal data structures used to efficiently manage the various Connection objects are fairly simple. The original architecture of AnnoyingChat/MultiServer only needed to handle chat boxes, so there is one global vector of Connection objects, to which new Connections are appended whenever a new connection is requested. Additionally, three other data structures have been implemented in the GeoJAVA System to manage (1) the floor, (2) GeoJAVASheets, and (3) user programs.

Since each Connection object serves as an interface to MultiServer, it is instantiated as a "generic class" that only receives messages until it knows what type of object it is interfacing. Once it is informed of whether it is interfacing a GeoJAVASheet, chat box, or user program, it stores different information. For example, a user program needs to store the ID of the GeoJAVASheet with which it communicates, and a GeoJAVASheet needs to store the ID of the user program that it is visualizing. Instead of creating a different class for each of the GeoJAVASheets, chat boxes and user programs, one Connection class was used because at the point of the initial connection, MultiServer would not be able to determine which object to instantiate. So the Connection object is first instantiated, and the type of object it is interfacing is determined afterwards.

The Vulture object repeatedly checks the list of Connections in case one "dies" unexpectedly. When it finds that a GeoJAVASheet has unexpectedly died, it updates the floor queue and grants the floor to the next GeoJAVASheet in the queue.
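The late binding of a Connection's role can be sketched as follows. This is an illustration under stated assumptions, not the GeoJAVA source: the class name ConnectionSketch, the Kind enum, and the "HELLO <type> <id>" announcement format are all hypothetical, since the text does not give the actual message that identifies the peer.

```java
// Hypothetical sketch of a "generic" connection whose role is bound late.
class ConnectionSketch {
    enum Kind { UNKNOWN, GEOJAVASHEET, CHAT_BOX, USER_PROGRAM }

    private Kind kind = Kind.UNKNOWN;
    // For a user program: the ID of its GeoJAVASheet.
    // For a GeoJAVASheet: the ID of the user program it visualizes.
    private int peerId = -1;

    // Assumed format: the first message announces the type, e.g. "HELLO USER_PROGRAM 7".
    void receive(String message) {
        String[] parts = message.trim().split("\\s+");
        if (kind == Kind.UNKNOWN) {
            if (parts.length < 2 || !parts[0].equals("HELLO")) {
                throw new IllegalStateException("type must be announced first");
            }
            kind = Kind.valueOf(parts[1]);
            if (kind != Kind.CHAT_BOX && parts.length > 2) {
                peerId = Integer.parseInt(parts[2]);
            }
            return;
        }
        // Later messages would be routed according to kind (omitted in this sketch).
    }

    Kind kind() { return kind; }
    int peerId() { return peerId; }
}

class ConnectionSketchDemo {
    public static void main(String[] args) {
        ConnectionSketch c = new ConnectionSketch(); // created before its role is known
        c.receive("HELLO USER_PROGRAM 7");           // role and peer ID bound afterwards
        System.out.println(c.kind() + " " + c.peerId());
    }
}
```

A single class with a role field matches the reasoning in the text: the server must create something at accept time, before it can know which kind of peer is on the other end of the socket.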

4.2 DISPE Architecture

Figure 2: DISPE architecture. [Diagram: The GeoJAVA System, the Remote Debugger, the Compiler Agents and Compiler joined by JNI inside an area labelled CGI, and the separate CORBAizer.]

The architecture for the DISPE system is given in Figure 2. The area marked The GeoJAVA System symbolizes the portion of Figure 1 containing MultiServer, Vulture, and Connection objects, and the areas marked Remote Debugger, Compiler Agents, Compiler, and The CORBAizer are the additions to the system, resulting in the DISPE system. The area surrounding the Compiler Agents and Compiler provides the web interface via CGI scripts, and JNI enables the compiler agents and the compiler to communicate. The CORBAizer is a separate entity from the web-based interface, since it will mostly be used on libraries that are not yet connected to the system. The main motivating factor in implementing the CORBAizer was to allow shared/dynamic libraries to be made available to executables residing on remote hosts. So it is a downloadable application to allow users to incorporate their local shared (or static) libraries into DISPE. The Remote Debugger runs on the web server as a part of the augmented GeoJAVA System. The implementation details of the debugger are given in Chapter 5.4.

Chapter 5 Implementation

5.1 The GeoJAVA System

The GeoJAVA System consists of six parts: (1) MultiServer, (2) ChannelGuide, (3) GeoJAVASheet, (4) GeoLIB, (5) Chat box, and (6) the Remote Compiler. We discuss the implementation of each of these components next.

5.1.1 MultiServer

MultiServer is a Java application adapted from the Free Internet Conferencing Tools (FICT) web page (now defunct). Slight modifications were incorporated for it to provide the services for the collaboration management of the GeoJAVA system. Originally the FICT version of MultiServer managed chat boxes and channels. We have adapted it so that it keeps track of the floor queue using the Connection and Vulture classes and provides the dynamic re-execution of geometric algorithms in the library. The Vulture class has been slightly modified, as described later, and the Connection class has also been augmented to handle a variety of data messages. This section describes the MultiServer, Connection, and Vulture classes in more detail.

MultiServer

MultiServer is the Java program running on the web server which creates the server thread and establishes the socket at the port number specified by the DEFAULT_PORT global constant. It listens on this socket for connections from users, and, when a new connection is made, a new Connection object is created and appended to MultiServer's queue of connections. A new Vulture object is also created initially, which ensures that all of the connections are valid. Note that because all new users connect through MultiServer, groups of users need not be concerned with the actual location of a "server" host. Thus, location transparency is supported.

MultiServer handles the floor control for each group by maintaining FIFO queues. When a new group is created, the floor queue for this group is empty. The first user to press the "Floor Request" button is added to the queue and becomes the "floor manager." Other users in the group who press this button thereafter are appended to the queue. When the floor manager presses the "Floor Release" button (or exits the system), he/she is removed from the queue, and the next user in the queue becomes the floor manager for the group.

Finally, MultiServer also functions as the "GeoManager" of the system by dynamically re-executing the algorithm when requested. That is, after the initial execution of an algorithm, MultiServer provides the functionality to allow the user to modify the original data input and immediately see the change in the output. As the user is moving the location of, say, a point that was used as input to the algorithm, the display will be updated with the output of the algorithm corresponding to the modified input. This is what provides "true animation" and easier debugging of algorithms for the developer. Furthermore, when users join a channel in the middle of the execution of an algorithm, MultiServer allows these users to "catch up" on the algorithm execution.
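The floor-control policy described above (one FIFO queue per group, with the head of the queue acting as floor manager) can be sketched in a few lines. The class and method names below are hypothetical and do not come from the MultiServer source; ignoring a repeated request from a user who is already queued is also an assumption.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical sketch of per-group FIFO floor control (not the MultiServer source).
class FloorQueue {
    private final Deque<String> waiting = new ArrayDeque<>();

    // "Floor Request": join the queue once; the head of the queue holds the floor.
    void request(String user) {
        if (!waiting.contains(user)) { // assumption: repeated requests are ignored
            waiting.addLast(user);
        }
    }

    // "Floor Release" (or exit): remove the user; the next in line takes over.
    void release(String user) {
        waiting.remove(user);
    }

    // Current floor manager, or null when the queue is empty.
    String manager() {
        return waiting.peekFirst();
    }
}

class FloorQueueDemo {
    public static void main(String[] args) {
        FloorQueue floor = new FloorQueue();
        floor.request("user1");
        floor.request("user2");
        System.out.println(floor.manager()); // user1 pressed the button first
        floor.release("user1");
        System.out.println(floor.manager()); // user2 is next in line
    }
}
```

Under this sketch, one FloorQueue instance would be kept per group, for example in a map keyed by channel name.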

Connection

The Connection object is responsible for acting as the interface between its corresponding GeoJAVASheet, chat box, or user program and MultiServer. It receives messages from its interface object and processes them appropriately. GeoJAVASheets go through MultiServer to broadcast messages to every GeoJAVASheet in their group or to send messages to their user program. Messages from chat boxes are broadcast directly to the chat boxes of every member of the group, and the user program's messages are sent to the GeoJAVASheets in its channel. Messages from GeoJAVASheets requesting or releasing the floor are forwarded to MultiServer with their corresponding channel, hostname and TCP port number. All of these actions are performed with the help of the interfacing Connection objects.

Vulture

The Vulture class is a simple thread that informs MultiServer when a connection has been closed or lost and cleans up the lists. Whenever possible, the Connection object will notify the Vulture thread when a connection is closed. But even if the Connection objects never notify the Vulture, the thread wakes up every five seconds and checks all connections, in case a Connection unexpectedly crashes before it is able to send a "close" message. One minor addition made to this class is that if the Vulture notices that a GeoJAVASheet has crashed while holding the floor, then it notifies the next GeoJAVASheet in the floor queue that it has been granted the floor.
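A minimal sketch of such a periodic sweep is given below. It is not the FICT or GeoJAVA Vulture class: the names VultureSketch, Conn, and sweep are hypothetical, and liveness is reduced to a boolean flag, whereas the real system would detect a closed or broken socket.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical sketch of the Vulture's periodic sweep (not the original class).
class VultureSketch implements Runnable {
    // Minimal stand-in for a Connection: a name plus a liveness flag.
    static class Conn {
        final String name;
        volatile boolean alive = true;
        Conn(String name) { this.name = name; }
    }

    private final List<Conn> connections;
    private final long intervalMillis;

    VultureSketch(List<Conn> connections, long intervalMillis) {
        this.connections = connections;
        this.intervalMillis = intervalMillis;
    }

    // One sweep: drop dead connections and report which ones were removed,
    // so that a caller could update the floor queue for any that held the floor.
    List<String> sweep() {
        List<String> removed = new ArrayList<>();
        synchronized (connections) {
            for (Iterator<Conn> it = connections.iterator(); it.hasNext();) {
                Conn c = it.next();
                if (!c.alive) {
                    it.remove();
                    removed.add(c.name);
                }
            }
        }
        return removed;
    }

    // Sweep, then sleep for the interval (five seconds in the text), until interrupted.
    @Override
    public void run() {
        try {
            while (!Thread.currentThread().isInterrupted()) {
                sweep();
                Thread.sleep(intervalMillis);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // restore the flag and stop quietly
        }
    }
}
```

Such a sweeper could be started with `new Thread(new VultureSketch(connections, 5000L)).start()`. Keeping the sweep in its own method separates the cleanup policy from the timing, so an explicit "close" notification from a Connection could call it directly instead of waiting for the next wake-up.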


5.1.2 ChannelGuide

ChannelGuide is an applet that ensures that multiple users do not enter the system with the same username. This is the applet that the user first sees when entering the GeoJAVA system. ChannelGuide takes the user and channel names requested by the user, communicates with MultiServer to check the current user lists for duplicates, and responds with the appropriate information, either allowing the user to start up GeoJAVASheet or prompting for a different user name. The ChannelGuide applet running on an X-Windows system is shown in Figure 3.

Figure 3: ChannelGuide Applet.

ChannelGuide functions by first communicating with MultiServer, requesting the lists of channels and users currently on the system. Once these lists are received, it processes them for display. If a username that is already in use is entered, then a message is displayed indicating that the entered name is invalid. Otherwise, the ChannelGuide window disappears, and a GeoJAVASheet and chat box are initiated with the user's user and channel names.


5.1.3 GeoJAVASheet

The GeoJAVASheet applet is actually a frame that contains (1) a panel onto which users may input graphical objects such as points, line segments, triangles, rectangles, polygons, polylines, circles, arcs, and (un)weighted and (un)directed graphs, (2) a row of buttons on top: Return (for communication with the user's application program), Undo (undo the previous action), Delete (a specific object on the panel), Modify (an object's component), Move (an entire object), Delete All, Quit, Toggle Grid (reference lines), (3) a choice box on the left to select an object to input onto the panel, (4) a "floor" button under the choice box (this will be explained later), and (5) a row of property selectors on the bottom, such as line widths, line colors, font styles, font sizes, line styles, and fill styles. Figure 4 shows a GeoJAVASheet running on a Microsoft Windows machine.

Figure 4: GeoJAVASheet Applet.

GeoJAVASheet is simply a GUI that responds to (1) messages received from MultiServer, and (2) the user's actions such as hitting the Return button or requesting control of the floor. Internally, GeoJAVASheet maintains lists of the various geometric objects. Any time a new object is drawn on the panel, a new instance of that object is appended to the list to which its type corresponds. Users may modify or delete objects on the sheet using one of the buttons on the top row.

Geometric objects can also be displayed (and consequently added to the lists) based on messages received from the user program. These messages are in a specific format that determines the action to take, the data object being referenced, and the object's coordinates and properties. For example, if the user's program wants to display a red point of radius ten pixels at location (25, 30), then the message would look like: (IPC_WRITE, GEOPOINT, 25, 30, 10, RED). Once the message is received, it is parsed, added to GeoJAVASheet's appropriate internal list of data structures, and displayed on the panel.

The user program receives data for geometric objects by sending a request message and then waiting for a message containing the data for that object. GeoJAVASheet sends a message to its corresponding user program when the user presses the Return button, indicating that an object has been entered and is ready to be sent to the user program. When the user program has explicitly requested an object for input, the choice box is updated to show the requested object type. When the user then hits the Return button, the last object of that type appended to the panel is stored in a message to be sent to MultiServer. The user program's ID is stored in GeoJAVASheet's Connection object, so once MultiServer receives this message, it forwards it to the appropriate user program.

The user program is able to control the geometric characteristics of the display, such as color, fill style, and line widths. However, this does not prevent the user who is interacting with the algorithm from modifying these display properties as well. Anytime before, during or after the execution, the user is free to adjust any of these properties to his/her liking. This is a feature of the modularity of the system, where the user interface is not tied to the user program, or vice versa.

There is an additional feature in the application version of GeoJAVASheet where the data on the sheet may be saved to and opened from files. Two additional buttons, "Open File" and "Save," provide this option. The data is stored in XFig format[34], just as in GeoMAMOS, so that these same files can be read in by GeoSheet and XFig. Other formats will be supported in future versions. Figure 5 is an instance of the GeoJAVASheet application running under Microsoft Windows.
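To make the message format concrete, the point example quoted above can be parsed as sketched below. This is a hypothetical parser, not GeoJAVASheet's own: the class GeoMessage and its field names are invented for illustration, only the point message is handled because the text gives no other layouts, and the underscore in IPC_WRITE is assumed.

```java
// Hypothetical parser for the point message quoted in the text
// (not GeoJAVASheet's own code; only GEOPOINT is handled).
class GeoMessage {
    final String action; // e.g. IPC_WRITE (underscore assumed)
    final String type;   // e.g. GEOPOINT
    final int x;
    final int y;
    final int radius;
    final String color;

    private GeoMessage(String action, String type, int x, int y, int radius, String color) {
        this.action = action;
        this.type = type;
        this.x = x;
        this.y = y;
        this.radius = radius;
        this.color = color;
    }

    // Parses "(ACTION, GEOPOINT, x, y, radius, COLOR)".
    static GeoMessage parsePoint(String text) {
        String s = text.trim();
        if (!s.startsWith("(") || !s.endsWith(")")) {
            throw new IllegalArgumentException("message must be parenthesized: " + text);
        }
        // Split on commas, discarding the whitespace around them.
        String[] f = s.substring(1, s.length() - 1).split("\\s*,\\s*");
        if (f.length != 6 || !f[1].equals("GEOPOINT")) {
            throw new IllegalArgumentException("not a point message: " + text);
        }
        return new GeoMessage(f[0].trim(), f[1], Integer.parseInt(f[2]),
                Integer.parseInt(f[3]), Integer.parseInt(f[4]), f[5].trim());
    }
}

class GeoMessageDemo {
    public static void main(String[] args) {
        GeoMessage m = GeoMessage.parsePoint("(IPC_WRITE, GEOPOINT, 25, 30, 10, RED)");
        System.out.println(m.type + " at (" + m.x + ", " + m.y + "), radius "
                + m.radius + ", color " + m.color);
    }
}
```

After parsing, the sheet would append the new point to its internal list of points and repaint the panel, as the text describes.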

Figure 5: GeoJAVASheet Application.

5.1.4 GeoLIB Geometric Library

The GeoLIB library in the current version consists of two parts: the LEDA [27, 24] and GeoLEDA libraries [22]. Both libraries are written in C/C++. An advantage

of this is that users who have already developed algorithms written in C/C++ may continue to use their algorithms without re-writing their code, and "new" users need not download a Java compiler if they do not already have one. In the future, we plan on incorporating the Computational Geometry Algorithms Library (CGAL) [32] into GeoLIB to be able to provide an even more comprehensive library for users.

LEDA

The GeoLIB library is based on the basic geometric classes and member functions of the Library of Efficient Datatypes (LEDA), currently version 3.7 [28, 25]. LEDA provides the foundation of GeoLIB by supplying the base classes for all of the geometric objects used in the system.

GeoLEDA

The visualization portion of the GeoLIB library is contained in the GeoLEDA library. GeoLEDA consists of geometric objects that (1) inherit from the objects in the LEDA library and (2) contain visualization member functions as well as interprocess communication (IPC) methods that provide the basic socket infrastructure for communication between the components of the GeoJAVA system. It also contains functions that implement basic geometric algorithms. This library was originally developed and used by the GeoMAMOS system. However, whereas GeoMAMOS uses UDP, all of the IPC functions have been modified to use TCP for the GeoJAVA system, for reasons explained in Appendix C. The tradeoff is that although initialization is slower, communication while connected is more reliable. Next, we briefly describe the main functions from the GeoLIB library that the programs use in order to visualize algorithms.

IPCJavaSetup(), IPCJavaSetup(char* host, int portnum)

This function sets up the initial TCP connection between the user program and GeoJAVASheet. It can take no arguments, in which case the user will be prompted for the host and port number at the command line, or it can take the host and port number for an input and output GeoJAVASheet that has the floor. It then creates a socket connection via which the user program communicates with its Connection object, which acts as an interface to MultiServer. Any messages sent to the Connection object (or MultiServer) can be forwarded to the appropriate GeoJAVASheet since the Connection object stores the ID of the GeoJAVASheet to which it sends, and from which it receives, input data. The user program must call IPCJavaSetup() before any visualization functions are called.

Graphic_Read(), Graphic_Write()

The Graphic_Read() and Graphic_Write() visualization functions are member functions implemented in every geometric object class and are issued from the user program. Graphic_Read() causes GeoJAVASheet to return to the program the last object entered onto the sheet. The process is as follows: (1) Graphic_Read() is called in the user program (GeoJAVASheet sets its choice box to the requested object type), (2) the user enters the object onto the sheet, (3) the user presses the "Return" button located at the top of GeoJAVASheet, which (4) sends the data for the object to MultiServer, which in turn forwards it to the user program. The choice box setting on GeoJAVASheet is set automatically when

GeoJAVASheet receives the message from Graphic_Read(), so the user is not expected to modify the choice box setting during a Graphic_Read() call. If for some reason the user explicitly changes the geometric object to input during a Graphic_Read(), GeoJAVASheet displays a message indicating that an invalid operation is being attempted.

The Graphic_Write() method performs the "opposite" action. The user program sends object data to GeoJAVASheet (through MultiServer), and GeoJAVASheet displays the object. The process is as follows: (1) Graphic_Write() sends the object's data to GeoJAVASheet, (2) the object is added to the corresponding list of geometric objects, and (3) GeoJAVASheet displays the object.

Because of the comprehensiveness of GeoLIB, the user is able to implement practically any geometric class and use any of the functions provided by the LEDA library. Therefore, a user may develop geometric algorithms that can perform any geometric computation without concern for implementing the display of intermediate or final results in their code. If the user wishes to implement and incorporate more complex objects, the user can easily add them to the system because of the basic geometric object classes that are already provided by the GeoJAVA System. Only checks for the validity of the specific characteristics of the object would be performed at the library level. For example, if a user wanted to use a rectilinear polygon, he/she could create a class that inherits from the polygon class and have the Graphic_Read() method ensure that the polygon it receives indeed consists of only horizontal and vertical edges. This could be done by either making another Graphic_Read() call or modifying the received data to make it valid. The same idea applies to other complex classes

such as planar maps, which can inherit from graphs. Although GeoJAVASheet may not provide a choice box for these complex classes, by inheriting from the classes that are provided and using their Graphic_* routines, any complex class can be supported by the system.

5.1.5 Chat Box

The Chat Box is another applet adopted from the Free Internet Conferencing Tools. It is a simple GUI consisting of a text field in which to enter text and a text box in which all messages from users within the same channel are displayed. It maintains a PrintStream object that handles the displaying of all of the messages, a DataInputStream that receives the messages, and a Socket object with which the connection to MultiServer is made. A list of the users currently in the channel has additionally been implemented to allow users to send private messages to individual members of the channel. A user may select a specific username on the user list in order to send messages to that user privately, or messages may be broadcast to everyone on the channel by selecting the asterisk (*), also on the list. Figure 6 is an instance of the Chat Box running under X-Windows.

5.1.6 Compilation Tool

The compilation tool originally written for the GeoJAVA System is described next. This version of the tool does not make use of agents and so has some limitations, but it is useful for those who only need to use the GeoLIB libraries. The compilation tool allows user-defined programs written in C/C++ to be compiled and executed directly on the web server. It is a series of Common Gateway Interface (CGI) forms that upload the code, create a corresponding Makefile for


Figure 6: Chat Box Applet.

the source files, compile, and, if the compilation is successful, run the executable with the host and port number of the user's GeoJAVASheet. Unsuccessful compilations result in a web page listing the compilation errors so that the user may debug their own code and restart the process with the corrections. With this tool, users need not worry about obtaining, installing, and compiling the appropriate libraries for their system.

Of course, security precautions must be taken in order to ensure that the user's program does not contain any malicious code. It is possible for users to include system code which may corrupt the system. Therefore, extra checks during the compilation stage of the tool ensure that such malicious code is not used. A combination of linking with a reduced C library and relocating the executable to a directory on the server that limits the permissions of the program provides the necessary security. Also, records of those who have been using the compilation tool keep track of the users and their actions, while still maintaining a certain level of privacy.

The following illustrations demonstrate the steps in the compilation process. The user is first admitted to the system with a valid username and password through a screen like Figure 7. This username information is saved in a cookie [66]


Figure 7: Login Screen

to keep track of those utilizing the system. Figure 8 is where the source code is uploaded from the user's local disk to the server. The source code may consist of multiple files, both source and header files, so multiple fields are provided. The user may then select a "Traditional Compile" or an "Agent Compile." We will go through the former now, and the latter will be described later. Once the files are uploaded, the filenames are appended to a makefile that contains references to the LEDA and GeoLIB libraries so that the user will be able to compile and visualize their program. The remote compilation begins when the user clicks on the "Traditional Compile" button, which causes the server to run make using the newly generated makefile. The output of the compilation is displayed upon completion. If the code is successfully compiled, as in Figure 9, then the user


Figure 8: Code upload page

is given a link to a page with two text fields: one to enter a hostname and another to enter a port number. This information corresponds to the host and port number of the GeoJAVASheet that the system brings up to visualize their algorithm. If the compilation is unsuccessful, as in Figure 10, then the error messages are displayed, and the user will need to return to the original File Upload page, where he/she must upload their corrected source files and repeat the process. In addition, a link to an online manual is available in case the user would like to consult the documentation on the usage of the functions in the GeoLIB library.

A successful compilation produces an executable located on the web server. Thus, when the user enters the host and port number for their GeoJAVASheet in the bottom form as shown in Figure 11 and clicks the Submit button, the CGI script will invoke the executable, which will be visualized on the specified GeoJAVASheet (and other GeoJAVASheets on the same channel).

In order to interact with an algorithm being visualized on the user's GeoJAVASheet, the user needs to become the "floor manager" by obtaining control of


Figure 9: Successful Compilation

the "floor," which is like a token that only one person in a channel can hold. This token can be passed around to other users in the channel so that different users can interact with the visualized algorithm. The remote execution is performed after the user has obtained control of the floor (becoming the "floor manager") and started executing their program. If multiple users wish to see the execution of the algorithm, then they should join the same channel as that of the floor manager and bring up a GeoJAVASheet of their own. The execution is then visualized on all of the GeoJAVASheets in the channel, and a chat box is also available for those who wish to communicate verbally.

The control of the floor may also be passed between the users in the channel so that the interaction is not restricted to the original floor manager. Users who wish to obtain control of the floor click the "Floor Request" button, and these requests are placed in a FIFO queue. Once the floor manager releases the floor, control is given to the next user in the queue. The floor manager's actions are the only ones that are visible to the rest of the channel. Thus, multiple users can


Figure 10: Unsuccessful Compilation

enter their own input to the program to be seen by the rest of the users in the channel in an orderly fashion. Once the execution of the program completes, the script displays any messages that were output during execution, so run-time errors can be checked at this point. The program can be run any number of times by simply re-submitting the form.

The user may also submit their code to add to the Algorithm Browser, which is shown in Figure 12. The Algorithm Browser lists the available algorithms for execution. When a new algorithm is submitted using this form, it will be displayed in the list of algorithms in the center form as in Figure 11 so that users may elect to run an algorithm other than their own.
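The floor-passing rule described in this section (one holder at a time, requests served first-in first-out) can be sketched as follows. This is a minimal illustration rather than the system's code: the class FloorControl and its method names are hypothetical, and granting the floor immediately when nobody holds it is an assumption the text does not spell out.

```java
import java.util.ArrayDeque;
import java.util.Queue;

// Hypothetical sketch of floor control: a single holder, with pending
// "Floor Request" clicks served in FIFO order. Names are illustrative.
class FloorControl {
    private String floorManager;                        // current holder, or null
    private final Queue<String> requests = new ArrayDeque<>();

    // Called when a user clicks "Floor Request".
    synchronized void request(String user) {
        if (floorManager == null) {
            floorManager = user;                        // assumption: a free floor is granted at once
        } else if (!user.equals(floorManager) && !requests.contains(user)) {
            requests.add(user);                         // otherwise wait in line
        }
    }

    // Called when the floor manager releases the floor.
    synchronized void release(String user) {
        if (user.equals(floorManager)) {
            floorManager = requests.poll();             // next in queue, or null if none
        }
    }

    synchronized String floorManager() {
        return floorManager;
    }

    public static void main(String[] args) {
        FloorControl floor = new FloorControl();
        floor.request("alice");
        floor.request("bob");
        floor.request("carol");
        floor.release("alice");
        System.out.println(floor.floorManager());       // prints: bob
    }
}
```

Ignoring a release from anyone other than the current holder keeps the token with exactly one user, which matches the description of the floor as something only one person in a channel can hold.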


Figure 11: Program Execution Page


Figure 12: Algorithm Browser


5.2 Compilation Agents

As was described in Chapter 4.2, the DISPE system can be seen as being composed of the original GeoJAVA System plus the Compilation Agents, the CORBAizer, and the Remote Debugger. Before we get into the implementation details of these components, we give an overview of the Binary File Descriptor library (libbfd), IBM Aglets, and the Java Native Interface (JNI) to provide a framework for understanding the implementation.

5.2.1 libbfd

The Binary File Descriptor library [11] serves as an intermediate layer providing a uniform interface for accessing object files on varying architectures. It is a C library containing data structures and methods that allow application developers to move their focus away from architecture-specific details by providing another layer of indirection. The main data structures used by the linker are bfd, asection, and asymbol, representing a BFD, a section, and a symbol, respectively. Each of these is described next.

BFD

When an application successfully opens a target file, which can be an object file or an archive, a pointer to the bfd structure is returned. All operations on the target object file are thus applied as methods to the BFD. The abstraction used within BFD is that an object file has:

a header,

a number of sections containing raw data,


a set of relocations, and

some symbol information.

BFDs opened for archives have the additional attribute of an index and contain subordinate BFDs, representing the individual object files within the archive. The bfd data structure definition is given in Appendix A.1.

asection

When a BFD is opened for reading, the section structures are created and attached to the BFD. Each section has a name which describes the section in the "outside world," such as .text, .data, and .bss. A back end (architecture-specific methods) may attach other sections containing constructor data, or an application may add a section to those attached to an already open BFD. The raw data may not necessarily be read in when the section descriptor is created. While some targets may leave the data in place until a certain function call, other back ends may read in all the data at once.

To write a new object from a BFD, the various sections to be written have to be created. They are attached to the BFD in the same way as input sections. The data to be written comes from input sections attached via output section pointers to the output sections, so the output section structure can be considered a filter for the input section. Figure 13 shows a sample of how these pointers are used. In this figure, a new section, "O", is created starting at 0x100 and is 0x123 long, containing two subsections, "A" at offset 0x0 (i.e., at vma 0x100) and "B" at offset 0x20 (i.e., at vma 0x120). The asection data structure is listed in Appendix A.2.
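The address arithmetic in this example can be checked with a few lines of code. The sketch below is purely illustrative: the Java classes are hypothetical stand-ins that borrow the field names from Figure 13 (output_offset, size, output_section, vma), not the actual asection structure from libbfd.

```java
// Hypothetical sketch: an input section's final address is the vma of its
// output section plus its output_offset. These classes only mirror the
// field names in Figure 13; they are not libbfd's data structures.
class SectionLayout {
    static final class OutputSection {
        final String name;
        final long vma;
        final long size;

        OutputSection(String name, long vma, long size) {
            this.name = name;
            this.vma = vma;
            this.size = size;
        }
    }

    static final class InputSection {
        final String name;
        final long outputOffset;
        final long size;
        final OutputSection outputSection;

        InputSection(String name, long outputOffset, long size, OutputSection outputSection) {
            this.name = name;
            this.outputOffset = outputOffset;
            this.size = size;
            this.outputSection = outputSection;
        }

        // Address of this input section once placed in its output section.
        long address() {
            return outputSection.vma + outputOffset;
        }
    }

    public static void main(String[] args) {
        OutputSection o = new OutputSection("O", 0x100, 0x123);
        InputSection a = new InputSection("A", 0x00, 0x20, o);
        InputSection b = new InputSection("B", 0x20, 0x103, o);
        System.out.printf("A at 0x%x, B at 0x%x%n", a.address(), b.address());
        // prints: A at 0x100, B at 0x120
    }
}
```

Note that the sizes are consistent as well: "B" starts at offset 0x20 and is 0x103 long, so it ends at offset 0x123, which is exactly the size of "O".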

section name   output_offset   size    output_section
"A"            0x00            0x20    "O"
"B"            0x20            0x103   "O"

section name   vma     size
"O"            0x100   0x123

Figure 13: Section Output

asymbol

The data structure containing symbol information, called asymbol, is mainly used by architecture-specific back end applications. Thus there is minimal support for dealing with symbols. The only way to obtain a particular symbol is by reading the entire symbol table of a BFD and performing a linear search of the table. No methods to write symbols are given since it is assumed that the only time an application would write a symbol is when an output BFD is being closed, at which time the entire BFD is written out. This lack of support is partly what forced us to find an alternative solution for passing symbol information between agents, as discussed later. The asymbol data structure is given in Appendix A.3.

5.2.2 IBM Aglets

An IBM Aglet [21], the term being a combination of agent and applet, is a Java object that can move from one host on the Internet to another. It supports the concept of autonomous execution and dynamic routing on its itinerary. It can be thought of as a generalization and extension of a Java applet and servlet. Aglets are hosted by an Aglet server, called a Context, in a way similar to the way applets are hosted by a Web browser. The Context provides an environment for aglets to

execute in, and the Java virtual machine (JVM) and the Aglet security manager make it safe to receive and host aglets. An Aglet Application Programming Interface (API) is provided in the Aglets Software Development Kit (ASDK). The API is a set of Java classes and interfaces that allow users to create mobile Java agents. It allows users to write agents that can run on any platform, thanks to Java's "write once, run anywhere" capabilities. The ASDK also provides the Tahiti aglet server, which is a Java application that allows the user to receive, manage, and send aglets to other computers that are running aglet servers. Note that the terms "aglet server," "agent server," and "Context" will be used interchangeably in this text.

5.2.3 Java Native Interface (JNI)

Javasoft's Java Native Interface (JNI) allows Java code that runs inside a Java Virtual Machine (JVM) to interoperate with applications and libraries written in other programming languages, such as C, C++, and assembly. This is done by compiling the native code into a shared or dynamic link library, which the JVM loads when the Java application starts up. The Java code can then call any of the native methods defined in the library. The native code can also access the JVM features by calling JNI functions, which are available through an interface pointer. Native methods receive the JNI interface pointer as an argument.

5.2.4 Agents and Compiler

The following describes the work involved in implementing the agents and compiler used in DISPE.


The Agents

The Compilation Agents are Java objects that use native compilers. Initially, only one type of Compilation Agent was thought to suffice in implementing this system. However, upon examining the code for the GNU compiler, it was found that much of the compilation depended on the data structures loaded into memory, and so it would not be feasible to have the Compilation Agent travelling to different hosts to perform the compilation. Therefore, the Compilation Agent in DISPE is stationary while it communicates with several "slave" agents that do the actual searching and transfer of data.

The main Compiler agent communicates with several types of agents in order to obtain information and complete the compilation process. These include the Client agent, the Include Agent, and the Library Proxy Agent. While the main Compiler agent is stationary and remains at the original site at which the compilation begins, the Client agent does the travelling and searching for missing symbols during the linking phase, the Include Agent searches for include files when the compilation detects that they are missing, and the Library Proxy Agent serves as a stationary agent at each server host that responds to Client agent requests for library locations and to Include Agent requests for include file locations. Library Proxy Agents are created and disposed of by the agents that need them.

There is also a Vulture class for the Compiler agent which performs some checks for fault tolerance. For example, when an agent somehow "dies" unexpectedly, the Compiler agent would not know of it. Therefore, the Vulture class is spawned as a separate thread which occasionally sends a "ping" message to the dispatched agents, and if it does not receive a reply within a certain time frame, it sends a replacement agent to the same site, or, if the site has gone down, to any existing backup site.


The Directory

In determining how to implement the search for hosts, include header files, libraries, and symbols, we considered the fact that the agents were written in Java, which inherently supports serialization. This led us to the most direct approach, which was to have the Compiler agent simply maintain a hash table of known hosts, header files, libraries, and symbols that could be read from disk upon startup and written to disk before the agent is disposed. This hash table would thus serve as the Directory. The advantage of this approach is that the web server's agent host system keeps a central database file of this information in a simple hash table, and no new protocols or additional management software need to be introduced into the system. Java proved to be very convenient in this respect. So when the Compiler agent is first started, it reads four hash tables representing the Host, Include, Library, and Symbol Directories containing host, header file, library, and symbol information, respectively, collected from previous compilations.
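The read-at-startup, write-before-disposal pattern described above can be sketched with standard Java serialization. This is an illustration under stated assumptions, not the DISPE code: the class DirectoryStore, the String-to-String table layout (name to host), and the sample symbol and host values are all hypothetical.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;
import java.util.Hashtable;

// Hypothetical sketch of a Directory: a serializable hash table that is
// read from disk at startup and written back before disposal.
class DirectoryStore {
    // Assumed layout: maps a name (symbol, library, header, or host)
    // to the host on which it was last found.
    @SuppressWarnings("unchecked")
    static Hashtable<String, String> load(File file) {
        if (!file.exists()) {
            return new Hashtable<>();                   // first run: start empty
        }
        try (ObjectInputStream in = new ObjectInputStream(new FileInputStream(file))) {
            return (Hashtable<String, String>) in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    static void save(Hashtable<String, String> directory, File file) {
        try (ObjectOutputStream out = new ObjectOutputStream(new FileOutputStream(file))) {
            out.writeObject(directory);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws Exception {
        File file = File.createTempFile("symbol-directory", ".ser");
        file.deleteOnExit();
        Hashtable<String, String> symbols = new Hashtable<>();
        symbols.put("convex_hull", "hostA.example.org");  // made-up entry
        save(symbols, file);
        System.out.println(load(file).get("convex_hull"));
    }
}
```

Under this layout, the four Directories would simply be four such tables stored in four files, with no separate database or lookup protocol required.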

Invoking the native compiler

The next step is to invoke the compiler. When examining version 2.8.1 of the GNU gcc compiler, it was found that the compilation process involves numerous passes and executable files that are called via system calls. The compiler goes through preprocessing, compilation, assembly, and linking [36], each invoked by a separate executable. As was determined earlier, the linking phase was what we needed to modify. However, in a normal compilation, the linker is automatically called by the compiler with various flags and options, so in order to remain consistent with the flags used during a "normal" compilation, we first determined which flags needed to be passed to the linker by running a sample compilation with the verbose (-v) flag. The same flags were then used when the agent called the linker.

The next hurdle was to determine how to call the linker from the aglet, since, contrary to our assumption, the linker is called via a system call rather than as a function call, and we needed to somehow pass the JNI interface pointer to the linker in order for both sides to communicate. Various options were considered, such as having the linker itself create a JVM for linking, or having the linker attach to the running JVM. This would then be followed by the creation of a Compiler agent. However, both of these options are unavailable at the time of this writing since we are on a Solaris platform, where support for attaching to a running JVM is not available (it is currently available only on Windows platforms), and the ASDK does not support the creation of a Context automatically. In the end, we were forced to use the JVM in which the agent was running, so we turned the modified linker's main() into a procedure which took the JNI interface pointer as a parameter. The linker code was then compiled into a shared library which the Compiler agent loaded upon creation and ran when needed. To invoke the compiler, then, the Compiler agent first makes a system call to invoke the C++ compiler with the -c flag, indicating that it should compile up to the linking stage, but not link. Then the linker is called via the newly created function, passing it the JNI interface pointer.

Header files

Before we can even reach the linking stage, however, the compiler should first be able to handle missing header files. First, we had to be able to detect when include files are missing. During a "normal" compilation, we knew that messages indicating that include files are missing are displayed during an earlier stage of the compilation. However, we did not want to modify the compiler in any way, and many different messages could be printed out during a

compilation. So we wrote a parser that would scan the messages displayed during this earlier compilation phase for the message indicating that an include file could not be found. We had to take note of the fact that an include file may reside in a subdirectory and require other include files in the same subdirectory as well. So the parser had to be able to deduce the subdirectory, if any, of the missing files.

When include files are determined to be missing, the Compiler agent will consult its own copy of the Directory to see if any of the include files have been found before. If so, then an Include Agent is dispatched to the host where the include files are located, taking along a list of the missing filenames. If these include files are not located in the Directory, the Include Agent is sent to the Main Server, which is a central host where libraries and other host information are stored. The Include Agent will then begin travelling and searching for the include file names that it has been given.

When an Include Agent arrives at a host, it searches for a Library Proxy Agent, and if it cannot find one, it creates one. The Library Proxy Agent reads the Directory of include files located at the host and sends the Include Agent a formatted list of include files. When the Library Proxy Agent first starts up, it also informs the parent of the Include Agent, whose proxy it receives from the creating agent, of the list of hosts that it "knows" about from its own Directory. The Compiler agent will then add any new hosts from this list. In this way, new host information can be dispersed in a "natural" manner by the agents themselves.

After reading all of the include files and sending them to the Compiler agent, if there are missing files remaining, the Include Agent will continue on to the next host. As long as the Include Agent has a list of hosts to travel to, it will continue on its itinerary. Once it reaches the end of the itinerary and there are still missing include files, the compilation will end, since all known hosts have been visited even though missing include files remain.
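The message parser described above could look something like the following sketch. It is an illustration only: the source does not give the exact text of the compiler's diagnostic, so the pattern below assumes each relevant line ends with "<header>: No such file or directory", and the class and method names are hypothetical.

```java
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch: scans compiler output for missing include files
// and deduces the subdirectory of each one.
class MissingIncludeParser {
    // Assumed diagnostic shape: "...: <header>: No such file or directory".
    private static final Pattern MISSING =
        Pattern.compile("([^\\s:]+): No such file or directory");

    // Returns the distinct missing header paths, in order of first appearance.
    static Set<String> missingHeaders(String compilerOutput) {
        Set<String> headers = new LinkedHashSet<>();
        Matcher m = MISSING.matcher(compilerOutput);
        while (m.find()) {
            headers.add(m.group(1));
        }
        return headers;
    }

    // Returns the subdirectory part of a header path, or "" if there is none.
    static String subdirectory(String header) {
        int slash = header.lastIndexOf('/');
        return slash < 0 ? "" : header.substring(0, slash);
    }

    public static void main(String[] args) {
        // Made-up compiler output for illustration.
        String output =
              "prog.cc:3: geo/point.h: No such file or directory\n"
            + "prog.cc:9: warning: unused variable\n"
            + "prog.cc:4: util.h: No such file or directory\n";
        for (String h : missingHeaders(output)) {
            System.out.println(h + " (subdirectory: \"" + subdirectory(h) + "\")");
        }
    }
}
```

Matching on the message text rather than on the compiler's exit status is what lets the compiler itself stay unmodified, at the cost of depending on the wording of its diagnostics.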

The Link

Once this compilation phase completes successfully, the Compiler agent will begin the linking phase by making a call to the modified gcc linker via JNI. The linker will first attempt to link the source code with whatever libraries are available on the local host. In a "traditional" compilation, if undefined symbols are not found in the final writing stage, error messages will be displayed indicating which symbols were not found, and the compilation would remove the output file before terminating. Contrary to our prior belief, these error messages are printed at the end, after attempting to merge all symbol definitions during the link, as opposed to during the merge of the symbol definitions from the library into the code. Therefore, we needed to determine at what point the merging was being performed and how to find the missing symbols before the linker went into the final writing stage. Upon careful examination of the 250,000 lines of linker code, we found that a hash table is kept for all of the symbols in the compilation, and within the hash table is a linked list of undefined symbols. During the compilation, this linked list is used to merge in data from libraries to "pull in" symbol definitions into the compilation. So now we had a data structure containing the names of the undefined symbols in the link.

The Search II

So now that we knew where the undefined symbols were, what protocol should be used to search for the symbol definitions on remote hosts? We could pass all of the undefined symbols to multiple agents, which are dispatched to multiple

remote hosts, or we could subdivide the list of undefined symbols among the agents travelling to remote hosts, in the hopes of having a "good" subdivision where the agents do not have to travel around too much, or we could pass all of the undefined symbols to one agent, which would travel along to each host, sending back symbol definitions to the Compiler agent as they came along. With the first option, there would be a lot of overhead in sending multiple agents all maintaining an entire list of undefined symbols, when, in all likelihood, only a handful of libraries would provide the symbol definitions. The second option would depend on the subdivision, with the worst case being that the symbol definitions would not be found until after all of the hosts have been visited. The third option may also take much time, but the overhead would be much smaller than with the first option. In the end, it was determined that the time it took to travel to a host and start up a search for symbols outweighed the cost of the overhead in maintaining a string of undefined symbols, so we decided to take the first option.

In fact, we implemented a little intelligence into the Compiler agent so that it did not simply dispatch Clients to all possible hosts. Recall that every time an Include Agent reaches a remote host, a Library Proxy Agent is referenced to find out about the available include files on that host. The Client goes through a similar process when it first reaches a remote host. It will invoke a Library Proxy Agent if one is not there already, and it will first request a list of libraries that the Library Proxy Agent is aware of. The Client will also tell it to send its parent, the Compiler agent, a list of hosts that it knows about. Whenever the Compiler agent receives a new list of hosts, it will first consult its Directory to see if it knows of any of the undefined symbols residing in any of the hosts and if a Client is there already. If it finds that an unvisited host indeed has symbols discovered before, or if it even finds an unvisited host, then a Client is dispatched there. Note that any Client that goes to a host and does not find any symbol definitions immediately disposes of itself.

The Link Continued

Back in the linker, we added a function in the Compiler's linker code (while trying not to break any of the existing code) that would create a formatted list (up to a specific size) of undefined symbols based on the list of undefined symbols in the linker's hash table. This list is sent back to the Compiler agent, which then dispatches a Client agent, passing it the same list. We limited the size of this list because of memory and network buffer constraints. The Compiler agent gives the Client an initial host to dispatch to, which will depend on whether any of the missing symbols have been found before, as stored in the Symbol Directory.

We then had to determine how the Client would find the missing symbols from within the libraries that it searched once it arrived at a remote site. We did not want to use a system command to call something like UNIX's nm to get a list of symbols in a library since it would make the compiler dependent on the location of nm, thus losing system independence. So we decided it would be better to figure out how the linker itself read symbols and merged them into the executable. Because we wanted to send back to the Compiler only the data for each symbol, we wanted to begin another "mini" compilation session that would only bring out the desired symbol definitions and no others. So the Client would call (with no other flags) a simplified version of the linker. The first step of this linker is to load the libraries at the host at which it is running. Then it had to insert the symbols to search for into the table of undefined symbols. After this, the linker had to run to the point where the symbols from the libraries are merged with the undefined symbols, after which the symbols had to be

somehow passed from the Client to the Compiler agent. This was where we ran into another difficulty. Although we could find the defined symbol, in what format should the symbol definition be sent to the Compiler agent? In order not to break the current compilation setup, we needed to be able to set up the symbol as if it resided in an object file so that the Compiler agent's linker could read it and link it into the executable correctly. This introduced many architecture-specific issues, though. As we mentioned in Chapter 5.2.1, an object file is made up of several different sections, all referenced by pointers in the bfd data structure. If we created an arbitrary BFD with arbitrary sections, and the format of the sections was not compatible with the architecture on which the compiler is running, then the method to close the BFD, which writes out the object file, would fault and/or terminate the compilation. Since we did not wish to introduce any architecture-specific code into the system, we had to think of an alternative way to pass the symbol definitions to the Compiler agent.

Upon further investigation, and considering how we could take advantage of spatial locality, it was decided that we could take the whole object file that contains a symbol definition and send it to the Compiler agent in an archive format, as opposed to sending each symbol definition. This proved to be an easier method of transferring definitions to the Compiler agent, and no architecture-specific modifications needed to be made to the compiler. In addition, spatial locality allowed for a possible reduction in the number and size of messages sent between the Client and Compiler agents since, in all likelihood, other undefined symbols encountered later may already reside in the object file being transferred. So now when the Client reaches a host, it begins its own version of a partial compilation by inserting the missing symbols into the undefined table and starting

63 a link which will \pull in" the necessary de nitions for any missing symbol de nitions it nds. When a de nition is found, the object le to which the symbol belongs will be appended to a temporary archive le. This is repeated for the entire list of unde ned symbols, ignoring any duplicate object les. Now that we had the BFD in archive format, how could the Client send it in a coherent manner to the Compiler agent? We still had the problem that this BFD consisted of pointers to the symbol de nitions located in other BFDs. Could we somehow create wrappers to the BFD structures to be accessible by agents which could respond to requests from Client agents? This would be quite a task to take care of the management of all the BFDs in a compilation. We searched for any possibly helpful methods that may be provided by the libbfd library, and nally came across bfd read and bfd write (which were not even documented). bfd read copies a BFD from a stream or memory into an array, and bfd write writes from an array into an output stream. So, we could simply call bfd read to copy the BFD into an array and send a copy of the array to the Compiler. Our next problem came when we attempted to re-create this BFD on the Compiler agent's side. We found that only two modes were available to open a BFD, reading and writing. Because we did not want to make any changes to libbfd, in order for the Compiler to be able to use the BFD that it had just received, it had to rst open an empty BFD le for writing, copy the array into it with bfd write, close it with bfd close, then open it up again in read mode. This new BFD is then added to the compilation, and the Compiler agent adds the newly discovered symbols into its Symbol Directory. Because multiple Client agents could be sending BFDs to the Compiler agent, we had to ensure that the new archive lenames were distinct. This was ensured by naming each archive in the format where is the

64 hostname of the remote host, is the port number on which the Context is running, and is a sequential number indicating the num-th BFD sent from that Client. When adding new symbol de nitions into a compilation, this new de nition may have references to symbols that are not de ned, and so new unde ned symbols would be generated. Therefore, this process of checking for unde ned symbols and searching for them if any exist is repeated. If there are no unde ned symbols remaining, the \ nal link" is performed, which is the actual sizing of the symbols in all the BFDs in the link and writing the executable to disk. If all goes well, the compilation results in a valid executable. If unde ned symbols still remain and a \repeat limit" is reached (the maximum number of \rounds" to search for symbols before giving up), then a \ nal link" is still attempted, and any remaining unde ned symbols are displayed on the GUI. At the end, the Compiler agent will clean up by disposing any remaining Client agents and writing its Directory to disk. The Directory is updated with any new host, library and symbol information. Figure 15 illustrates the architecture of the Compilation Agents used in DISPE. The oval shaped objects are the Aglets, the dotted rectangles are the Directories, the rectangles with the rounded corners are the Aglet Contexts, and the rectangle with the sharp corners is the Web Server. The arrows labeled A and B indicate the agents travelling between Contexts. The arrow labeled C is a message being sent from the Client agent to the Compiler agent. Note that remote Contexts do not maintain Symbol Directories unless a Compiler agent runs there since other agents have no need to reference a Symbol Directory.
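The repeated search for undefined symbols can be sketched independently of libbfd. The following is a minimal model, not the DISPE implementation: object files are reduced to the sets of symbols they define and reference, and the names ObjectFile, LinkResult, and resolve_rounds are hypothetical. It shows how pulling in a whole object file can satisfy several symbols at once, how new undefined symbols appear, and how the repeat limit bounds the number of rounds.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>

// Hypothetical stand-in for one object file inside a remote library.
struct ObjectFile {
    std::set<std::string> defines;     // symbols this object file defines
    std::set<std::string> references;  // symbols this object file needs
};

struct LinkResult {
    std::set<std::string> pulled;      // object files added to the link
    std::set<std::string> unresolved;  // symbols still undefined at the end
    int rounds;                        // search rounds actually performed
};

// Search round by round until nothing is undefined or the limit is reached.
LinkResult resolve_rounds(std::set<std::string> undefined,
                          const std::map<std::string, ObjectFile>& remote,
                          int repeat_limit) {
    assert(repeat_limit >= 0);
    LinkResult result;
    result.rounds = 0;
    std::set<std::string> defined;  // everything defined by pulled objects
    while (!undefined.empty() && result.rounds < repeat_limit) {
        ++result.rounds;
        std::set<std::string> pending;  // candidates for the next round
        for (const std::string& sym : undefined) {
            if (defined.count(sym)) continue;  // satisfied by an earlier pull
            bool found = false;
            for (const auto& entry : remote) {
                const ObjectFile& obj = entry.second;
                if (obj.defines.count(sym) == 0) continue;
                found = true;
                // Duplicate object files are ignored, as in the text.
                if (result.pulled.insert(entry.first).second) {
                    defined.insert(obj.defines.begin(), obj.defines.end());
                    pending.insert(obj.references.begin(), obj.references.end());
                }
                break;
            }
            if (!found) pending.insert(sym);  // try again in the next round
        }
        undefined.clear();
        for (const std::string& sym : pending)
            if (defined.count(sym) == 0) undefined.insert(sym);
    }
    result.unresolved = undefined;  // reported to the user after the final link
    return result;
}
```

In this model a symbol that no host can supply simply survives every round, so the loop ends at the repeat limit and the symbol is returned in unresolved, mirroring the case where a final link is still attempted.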


Figure 14: Successful Agent Compilation

The GUI

Lastly, we should mention the web interface to the Compiler agent. The ASDK originally came with two types of Contexts written in Java: Fiji, an applet, and Tahiti, a Java application. Initially, we hoped to be able to use the Fiji applet, but after the last upgrade of the ASDK to version 1.1b1, Fiji was removed because of several problems with its maintenance. So we had to figure out a way for users to access the Compiler agent from the web. We wanted to create another MultiServer-like application that waited on a socket, but Tahiti could not be modified because we did not have access to its source code. So instead, we wrote our own Context, which creates a socket and listens for messages requesting the creation of Compiler agents. Then we wrote an applet that connects to the socket and simply displays any messages that it receives while listening on the socket. The Context then takes the applet's socket ID and passes it to the newly created Compiler aglet. Thus, the Compiler aglet can send messages to be displayed on the applet to indicate the status of the compilation. Consequently, we were able to develop a web interface to the compiler agents.

Figure 15: Agents Architecture

5.2.5 Registration Tool

Now that we had the Library Proxy Agent to access libraries on a host, we had to decide how to inform this Agent of any new libraries or include files that may be added to a system. In other words, from the Context on which the Library Proxy Agent is running, we could either copy any libraries and include files that the user wished to make available to the system into a pre-defined directory, or we could create a set of files that list the directory paths and filenames of libraries and header files available on that Context's host system. For ease of use, the latter choice was taken. But then we needed to ensure that the library and header file names were saved in these registration files in a consistent format, and so the Registration Tool was developed. It is a Java application that the user runs when he/she wishes to make a library available to the agents. Figure 16 is an instance of this tool.

Figure 16: Registration Tool

Each Group represents a set of libraries and include directories that depend on each other. As in Figure 17, new libraries or include directories can be appended to an existing group, or a new group of a specific architecture can be added. Information on existing items in a group can be displayed from the main registration window by clicking on the Info button. A sample information window is displayed in Figure 18. Once a library is registered into the system via this Registration Tool, agents can find the new library when they arrive at the host, through the Context providing the registered information. In the case that a new host is being created, the host can be added to the system during a compilation: at the beginning, the Compiler Agent asks whether any new hosts are to be added, and the user can input the information at that time.

Figure 17: Adding a Group

Figure 18: Registered Information

5.3 The CORBAizer

Although the compilation agents can compile with static libraries, shared (or dynamic link) libraries are a different story. Because shared libraries are only useful (at the time of this writing) in a system where they are loadable into the address space of the running executable [33], this scenario breaks down when we move it to a distributed system. In addition, we still have architecture issues where libraries on different architectures cannot be compiled together, even if they have symbol definitions to offer. Fortunately, these issues dissolve with the CORBAizer.

The CORBAizer is based on the Exerciser Generator introduced in the RunClass tool [39]. The original tool was used as an object introspection tool in C++, where users could access objects for classes defined in a library through a user interface. Objects could be instantiated dynamically through the user's commands via the GUI, and methods could be examined within these objects. In order for a library to be inspected, the header files of the library had to be parsed and converted into a format that the system's engine could understand. The CORBAizer is based on this parser.

The CORBAizer parses the header files of a library and generates two sets of C++ code: server code and client code. These source code files can be compiled separately, even on separate architectures, and used to communicate with each other via CORBA. The CORBA implementation that we used was MICO, so it may be more appropriate to use the term MICOizer, but we hope to expand to other CORBA implementations in the future.

Once the CORBAizer generates the server and client source files from the header files, the server code is compiled with the original library corresponding to the input header files, and the client code is used to generate a new library that is registered with DISPE, as was described previously. The server code can either be added to the CORBA Implementation Repository, which is a daemon that automatically starts the server when a method in its library is requested, or the server can be run manually to wait for requests from clients. Then, once the client library is added to the system, any program using the classes and methods defined in the original, possibly shared, library will be compiled with the client library just generated. When the program is executed, it uses CORBA to make a connection with the server program, which executes the appropriate methods remotely. The user need not be concerned with any details of the CORBA implementation since it is all handled by the client library.

The basic idea behind the CORBAizer is that from a library's header files, we can retrieve a "skeleton" for the library's contents. We then take the skeleton and generate the server and client code. The client is a copy of the skeleton with the implementations of the objects replaced by CORBA calls for searching for and connecting with the server and invoking the "real" methods. The server code takes advantage of CORBA's tie feature [29], which allows legacy classes to be wrapped by a CORBA class that takes the legacy class as the object of a template. The legacy class is then called when the CORBA class in the server receives a request for a method invocation. An example of using the CORBAizer is given in Chapter 6.5.
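The tie idea can be illustrated without an ORB. The following is a minimal sketch, assuming a hand-written abstract class in place of the skeleton an IDL compiler would generate; ValueSkeleton and ValueTie are hypothetical names, and MyLocalClass mirrors the example class used in Chapter 6.5. The point is only that the legacy class is wrapped by a template and is neither modified nor made to inherit from anything.

```cpp
#include <cassert>

// Stand-in for a generated skeleton class (hypothetical, no CORBA involved).
class ValueSkeleton {
public:
    virtual ~ValueSkeleton() {}
    virtual void SetValue(long v) = 0;
    virtual long GetValue() = 0;
};

// Tie: implements the skeleton by forwarding every call to a legacy object.
template <class T>
class ValueTie : public ValueSkeleton {
public:
    explicit ValueTie(T* legacy) : legacy_(legacy) { assert(legacy_ != 0); }
    ~ValueTie() { delete legacy_; }  // in this sketch the tie owns the object
    void SetValue(long v) { legacy_->SetValue(static_cast<int>(v)); }
    long GetValue() { return legacy_->GetValue(); }
private:
    ValueTie(const ValueTie&);             // non-copyable
    ValueTie& operator=(const ValueTie&);
    T* legacy_;
};

// Legacy library class, unchanged and unaware of the wrapper.
class MyLocalClass {
public:
    MyLocalClass() : MyValue(0) {}
    void SetValue(int v) { MyValue = v; }
    int GetValue() { return MyValue; }
private:
    int MyValue;
};
```

A request arriving at the server side would be dispatched through the ValueSkeleton interface, so the caller never needs to know which legacy type sits behind the tie.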

5.4 Debugger

The Remote Debugger consists of the executable, the debugger, and the web interface. The executable is the one just compiled as described above, the debugger is a modified version of GNU's gdb, and the web interface is a CGI script that starts up the gdb applet (the GUI) and sets up the socket connection to allow communication between the debugger and the GUI. The latest version of gdb at the time of this writing, version 4.17, was used and modified for this system. gdb provides various modules called "hooks" that are called by the debugging system. These modules allow programmers to extend the functionality of the debugger and thus allowed us to extend the debugger for the web. The following describes the extensions made to gdb.


5.4.1 Communication

Because we are porting a program designed to be run on a single system to a distributed system, the communication modules are the most important additions made to the original program. These additions are implemented in the following hooks:

init_ui_hook: used for initialization of the user interface and the sockets for communication. This hook is handled before any others.

command_loop_hook: receives gdb commands from the user. This is a while loop that continuously receives command messages and forwards them to the function execute_command. It is the main body of gdb and terminates when it receives the "quit" command.

fputs_unfiltered_hook: handles gdb output, which is either filtered or unfiltered. The filtered output is simply formatted so that it can be passed through this hook, which sends the output to the GUI.
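The hook mechanism described above amounts to a function pointer that the core consults before falling back to its default behaviour. The following is a minimal sketch of that pattern only; output_hook, emit_output, and send_to_gui are hypothetical names and are not gdb's actual declarations.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hook pointer: null means "use the default terminal behaviour".
typedef void (*output_hook_t)(const char* text);
static output_hook_t output_hook = 0;

// Core output routine: divert to the hook when one is installed.
void emit_output(const char* text) {
    assert(text != 0);
    if (output_hook)
        output_hook(text);
    else
        std::fputs(text, stdout);
}

// Example extension: collect output in a buffer that a socket writer
// could later drain and forward to the GUI.
static std::string gui_buffer;
void send_to_gui(const char* text) { gui_buffer += text; }
```

Installing the extension is a single assignment (output_hook = send_to_gui), which is why the core debugger needs no other changes to redirect its output.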

One problem we encountered in using sockets was that messages were buffered in blocks, so that when, say, gdb's output messages were sent to the socket, the GUI would not receive them until the socket was closed. We thought of two options to work around this dilemma. We could either modify the user's code being debugged or write a preprocessing program. These options were based on the setbuf function provided by the C library. By setting the stream buffer to a value of 0 using this function, the stream would become unbuffered and our problem would be resolved. However, using setbuf entailed either adding it to the source code of the debugged program before it was compiled, or adding it to some initialization routine that would be called before executing the program being debugged. Neither of these options was very attractive since they could potentially cause many side effects.

The UNIX FAQ [70] offered a better solution: to use a pseudo-terminal to make the socket "think" it is communicating with a terminal, thus leaving the messages it receives on the socket unbuffered. Gdb thus sends messages to the GUI by sending them to the pseudo-terminal, which forwards the messages directly to the GUI. Likewise, messages from the user via the GUI are sent to the pseudo-terminal, which forwards them to gdb. This makes the interaction between the user and gdb the same as if the user were running gdb locally. Figure 19 is an illustration of the design of the remote debugger.
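For reference, the setbuf approach that was considered and rejected looks roughly like the following. This is a minimal sketch using the standard C library call setvbuf, the general form of setbuf; the wrapper name make_unbuffered is hypothetical. It has to run inside the debugged program itself, before any I/O on the stream, which is exactly the intrusion into the user's code that the pseudo-terminal solution avoids.

```cpp
#include <cassert>
#include <cstdio>

// Switch a stdio stream to unbuffered mode.
// Must be called before any other operation on the stream.
bool make_unbuffered(std::FILE* stream) {
    assert(stream != 0);
    // setbuf(stream, NULL) is shorthand for setvbuf with mode _IONBF.
    return std::setvbuf(stream, 0, _IONBF, 0) == 0;
}
```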

Figure 19: Remote Debugger Architecture

Although the use of a pseudo-terminal on a socket resolved the problem of the buffered messages, we had another problem: the inputs for gdb and for the program itself (in the case that it used I/O functions such as scanf) were both now listening on the same socket. So if the user program requested any input from the user, the user's input would never reach it, since gdb received all messages on the socket as commands. In order to resolve this problem, we had two options. Either we separate the inputs for gdb and the user program into two channels so that each could receive input directly, or we set up a protocol where both inputs are still received on the same channel, but we somehow develop a way to distinguish between the different types of messages. Although the latter method would save resources on the server, it would also be slower and messier to code. Thus, the former method was chosen. This resulted in the two windows used in GdbWeb, the Command window and the Watch window, shown in Figures 20 and 21, respectively.

5.4.2 The GUI

The user interface is a Java applet that performs three functions:

1. communicate with gdb running on the server via a socket
2. receive and store a history of commands from the user
3. display the output from gdb and the user program in separate windows

The CGI script first starts up gdb on the server, passing the executable name as a parameter, and retrieves the port number on which it is waiting. This port number is then used as a parameter to the gdb applet, which creates the Watch and Command windows.


Figure 20: Main GDB Window

Figure 21: User Input Window

Chapter 6 Example

In this chapter, a sample session that a programmer may have in the DISPE system is demonstrated.

6.1 Source Code

A simple program that uses the GeoLIB libraries provided by the GeoJAVA System follows.

    #include <...>
    #include <...>

    int main(int argc, char** argv)
    {
        cout << ...;
        if (argc > 1)
            IPCJavaSetup(argv[1], atoi(argv[2]));
        else
            IPCJavaSetup();
        cout << ...;
        ...
        P->Graphic_Read();
        ClearSheet();
        GeoPause("Cleared Sheet");
        P->Graphic_Write();
        GeoPause("Wrote point");
        return 0;
    }

This program simply establishes a connection with the visualization tool, which reads a point from the user. The tool then clears the sheet and draws the same point.
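The argument handling at the top of main can be isolated as a small helper. The following is a minimal sketch, assuming the host and port arrive as the first two command-line arguments; SheetEndpoint and parse_endpoint are hypothetical names and are not part of GeoLIB. Unlike the listing, this sketch checks that both arguments are present before reading argv[2].

```cpp
#include <cassert>
#include <cstdlib>
#include <string>

struct SheetEndpoint {
    bool given;        // true when host and port came from the command line
    std::string host;  // host running the visualization tool
    int port;          // port on which the tool is listening
};

SheetEndpoint parse_endpoint(int argc, char** argv) {
    assert(argc >= 1 && argv != 0);
    SheetEndpoint ep;
    ep.given = false;
    ep.port = 0;
    if (argc > 2) {  // both argv[1] and argv[2] must exist
        ep.given = true;
        ep.host = argv[1];
        ep.port = std::atoi(argv[2]);  // arguments arrive as strings
    }
    return ep;
}
```

A caller would then choose between the two overloads of the setup function based on ep.given, passing ep.host and ep.port in the first case.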

6.2 "Traditional" Compilation

This program, named GSprog.c, is written in C/C++ and has only a few simple "rules" for its structure to follow in order to work with the GeoJAVA system. These rules consist of including the appropriate include files and calling IPCServiceSetup() at the beginning of the program. Then the code may make calls to Graphic_Read() and Graphic_Write() for the geometric objects used. Details of these functions are given in Chapter 5.1.4.

When a C/C++ program is executed with arguments, these arguments are stored in the argc and argv variables, where argc is an integer indicating the number of arguments (including the program name) and argv is an array of strings, with each string storing one argument passed to the program. For example, say the program name is "visib" and its arguments are the host and port number, "geo" and "1234". Then argc would equal 3 and the array argv would contain {"visib", "geo", "1234"}. Since IPCJavaSetup() is overloaded as IPCJavaSetup(char* host, int port), the second argument to IPCJavaSetup() must be converted to an integer, which is done using the C library function atoi(). In this example, before the call to IPCJavaSetup(), the program checks its arguments and assumes that if arguments exist, they are the host and port number for the GeoJAVASheet with which it should communicate. This format should be followed, especially if the user wishes to execute their program off the web server, which will execute the program with the host and port number as arguments.

After logging in to the DISPE system, the user uploads the file via the File Upload page, as shown in Figure 8, and clicks on the Traditional Compile button. The next page displays the output of the compilation, whether it is successful or unsuccessful. If it is successful, a page to either debug or execute the program is given. Before we get to either of these steps, let us see what happens during an agent compilation.

6.3 Agent Compilation

After logging into DISPE and selecting the file(s) to upload, this time we click on the Agent Compile button. On the web server's side, the Context for the Compiler Agent is waiting on a socket, and will dispatch agents when it receives messages on the socket from the GUI. Since the user has logged in to access the system, the username information is already stored in the agent, and so Figure 22 displays the messages seen by a user who has already gone through at least one agent compilation session before (as indicated in the figure by the "Welcome Back!" message). For this example, a new session will be created. In the case that a first-time user is running the compilation, this prompt will not be displayed, and the rest of this session would be the same.

Figure 22: Beginning of Compilation Agent Session

The next question asks which level of interaction the user would like to use for this session. If the user selects no interaction, then the first header files and libraries encountered during the compilation will automatically be selected and included in the compilation. If the user selects library interaction, then any duplicate include filenames or library names found on a single host will be displayed and the user will be prompted to choose either one of the given filenames or none. The option for none is given in the case that a library or include file is found on a separate host from where these duplicate files reside. Figure 23 prompts the user to select an interaction level.

Figure 23: Selecting Interaction Level

The final question asks whether the user has set up his/her own Context containing registered libraries that he/she would like to use in the compilation. For this example, no new Contexts will be added. After these initial questions have been answered, the actual compilation process begins. Since library level interaction is used, a message indicating that duplicate include files have been found on host geometry is displayed, as in Figure 24.


Figure 24: Duplicate Include Files Found

The full path names of the files found are also displayed. From the path names, the include files in the /geometry/TCPaurora/incl directory are chosen. After a selection is made, any other choices between duplicates found in these two directories will default to the one chosen first. This is evident in the "Found include file..." messages in Figure 25. Recall that during the registration procedure introduced in Chapter 5.2.5, the include files and libraries are grouped together. So just as the decisions made for the include files are "remembered," the libraries corresponding to the include directories chosen are also "remembered," so no questions about duplicate libraries are asked, and the linking phase completes successfully, as in Figure 26.

Figure 25: Finding Include Files
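The "remembered" choice between duplicates can be sketched as follows. This is a minimal model under stated assumptions, not the agents' code: the class DuplicateResolver and the helpers dir_of and choose_first are hypothetical, the "none" option is left out, and the user prompt is replaced by a function that always picks the first candidate. Once a directory has been chosen, later duplicates that include a file from that directory are resolved without asking again.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Directory part of a path such as "/geometry/TCPaurora/incl/Point.h".
std::string dir_of(const std::string& path) {
    std::string::size_type slash = path.rfind('/');
    return slash == std::string::npos ? std::string() : path.substr(0, slash);
}

// Stand-in for the GUI prompt: always selects the first candidate.
std::size_t choose_first(const std::vector<std::string>&) { return 0; }

class DuplicateResolver {
public:
    typedef std::size_t (*Chooser)(const std::vector<std::string>&);

    explicit DuplicateResolver(Chooser ask_user)
        : ask_user_(ask_user), prompts_(0) {}

    // Returns the path to use among several files with the same name.
    std::string resolve(const std::vector<std::string>& candidates) {
        assert(!candidates.empty());
        // Prefer a candidate from a directory that was chosen earlier.
        for (std::size_t i = 0; i < candidates.size(); ++i)
            if (chosen_dirs_.count(dir_of(candidates[i])))
                return candidates[i];
        // Otherwise ask, and remember the directory of the answer.
        ++prompts_;
        std::size_t pick = ask_user_(candidates);
        assert(pick < candidates.size());
        chosen_dirs_.insert(dir_of(candidates[pick]));
        return candidates[pick];
    }

    int prompts() const { return prompts_; }

private:
    Chooser ask_user_;
    std::set<std::string> chosen_dirs_;
    int prompts_;
};
```

Extending the remembered set from include directories to the libraries registered in the same group would give the behaviour described for the linking phase.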

6.4 Debugging

Now that the program has been compiled, the debugger can be used to walk through it. Figures 27 and 28 display the two debugger windows when the debugger first starts up. The same debugger is used whether we compile our program using a "Traditional Compilation" or an "Agent Compilation." In the GdbWeb window, any of the commands that GNU's gdb takes can be typed, and the Watch window displays any stdout or stderr messages and reads text for stdin. Figure 29 shows the GdbWeb window "in action."

Figure 26: Successful Compilation


Figure 27: GdbWeb Window

Figure 28: Watch Window


Figure 29: GdbWeb Window Help


6.5 The CORBAizer

A CORBAizer session is demonstrated next. A header file named testlib.h will be used in this example.

    #ifndef TESTLIB_H
    #define TESTLIB_H
    #include <...>

    // globals
    extern int MyGlobalVariable;
    int MyGlobalParameterFunction(char*);

    class MyLocalClass {
    public:
        // Constructor
        MyLocalClass();
        // Destructor
        ~MyLocalClass();
        void SetValue(int);
        int GetValue();
    private:
        int MyValue;
    };
    #endif /* TESTLIB_H */

When the file is run through the CORBAizer, telling it that it belongs to "testlib," the following files are generated: DISPE_1_0_testlib.idl, tdserver.cc, tdserver.h, tmp/newLib.cc, and tmp/testlib.h. The files in the tmp subdirectory are the client files, while the rest are the server files. The Interface Definition Language (IDL) file provides the interface definitions for the library based on the input header file(s). This file should be run through MICO's IDL compiler program in order to generate the skeleton source files for the interface. The IDL compiler will generate DISPE_1_0_testlib.cc and DISPE_1_0_testlib.h. Along with the server files generated by the CORBAizer, these files are compiled with the MICO library and the original "testlib" library to create the server. The following is the IDL file generated from testlib.h.

    interface DISPE_1_0_MyLocalClass {
        void SetValue(in unsigned long arg0);
        unsigned long GetValue();
    };

    interface DISPE_1_0_GLOBALS {
        unsigned long DISPE_1_0_GLOBALS_MyGlobalParameterFunction(in string arg0);
        attribute unsigned long DISPE_1_0_GLOBALS_MyGlobalVariable;
    };

    interface DISPE_1_0_testlib {
        DISPE_1_0_GLOBALS create_globals();
        DISPE_1_0_MyLocalClass create_MyLocalClass();
    };

Class definitions basically correspond to interfaces in CORBA, and since any private members of MyLocalClass would not be used by any user program, they are not generated in the IDL file. ints are not recognized in the IDL, so the CORBAizer converts them to long. The interface definition for DISPE_1_0_GLOBALS is used to hold global functions and variables, and DISPE_1_0_testlib is a factory that creates distinct instances of objects when requests are received for them. (If we did not use a factory, we could only have one instance of each class in the server.) The following is the tdserver.h file.

    class DISPE_1_0_GLOBALS_ {
    public:
        DISPE_1_0_GLOBALS_();
        CORBA::Long DISPE_1_0_GLOBALS_MyGlobalParameterFunction(char* arg0);
        CORBA::Long DISPE_1_0_GLOBALS_MyGlobalVariable();
        void DISPE_1_0_GLOBALS_MyGlobalVariable(CORBA::Long x);
    };

And the corresponding tdserver.cc file looks like the following.

    #include "DISPE_1_0_testlib.h"
    #include "newtest.h"
    #include "testlib.h"
    #include "tdserver.h"

    DISPE_1_0_GLOBALS_::DISPE_1_0_GLOBALS_() { }

    CORBA::Long DISPE_1_0_GLOBALS_::
    DISPE_1_0_GLOBALS_MyGlobalParameterFunction(char* arg0)
    {
        return MyGlobalParameterFunction(arg0);
    }

    CORBA::Long DISPE_1_0_GLOBALS_::DISPE_1_0_GLOBALS_MyGlobalVariable()
    {
        return MyGlobalVariable;
    }

    void DISPE_1_0_GLOBALS_::DISPE_1_0_GLOBALS_MyGlobalVariable(CORBA::Long x)
    {
        MyGlobalVariable = x;
    }

    class DISPE_1_0_testlib_impl : virtual public POA_DISPE_1_0_testlib {
    public:
        DISPE_1_0_testlib_impl (PortableServer::POA_ptr);
        DISPE_1_0_GLOBALS_ptr create_globals();
        DISPE_1_0_MyLocalClass_ptr create_MyLocalClass ();
    private:
        PortableServer::POA_var mypoa;
        int ctr;
    };

    DISPE_1_0_testlib_impl::DISPE_1_0_testlib_impl (PortableServer::POA_ptr _poa)
    {
        mypoa = PortableServer::POA::_duplicate(_poa);
        ctr = 0;
    }

    DISPE_1_0_GLOBALS_ptr DISPE_1_0_testlib_impl::create_globals()
    {
        char str[30];
        sprintf(str, "%04dglobals", ctr++);
        PortableServer::ObjectId_var oid = PortableServer::string_to_ObjectId(str);
        CORBA::Object_var obj =
            mypoa->create_reference_with_id(oid, "IDL:DISPE_1_0_GLOBALS:1.0");
        DISPE_1_0_GLOBALS_ptr aref = DISPE_1_0_GLOBALS::_narrow(obj);
        assert (!CORBA::is_nil(aref));
        return aref;
    }

    DISPE_1_0_MyLocalClass_ptr DISPE_1_0_testlib_impl::create_MyLocalClass ()
    {
        char str[30];
        sprintf(str, "%04dMyLocalClass", ctr++);
        PortableServer::ObjectId_var oid = PortableServer::string_to_ObjectId(str);
        CORBA::Object_var obj =
            mypoa->create_reference_with_id(oid, "IDL:DISPE_1_0_MyLocalClass:1.0");
        DISPE_1_0_MyLocalClass_ptr aref = DISPE_1_0_MyLocalClass::_narrow(obj);
        assert (!CORBA::is_nil(aref));
        return aref;
    }

    class ClassManager : public virtual POA_PortableServer::ServantActivator {
    public:
        PortableServer::Servant incarnate (const PortableServer::ObjectId &,
                                           PortableServer::POA_ptr);
        void etherealize(const PortableServer::ObjectId &,
                         PortableServer::POA_ptr,
                         PortableServer::Servant,
                         CORBA::Boolean, CORBA::Boolean);
    };

    PortableServer::Servant
    ClassManager::incarnate(const PortableServer::ObjectId &oid,
                            PortableServer::POA_ptr poa)
    {
        CORBA::String_var name = PortableServer::ObjectId_to_string(oid);
        char* nameptr = &(name[4]);
        if (!strcmp(nameptr, "globals")) {
            DISPE_1_0_GLOBALS_ *g = new DISPE_1_0_GLOBALS_;
            return new POA_DISPE_1_0_GLOBALS_tie<DISPE_1_0_GLOBALS_> (g);
        } else if (!strcmp(nameptr, "MyLocalClass")) {
            MyLocalClass * x = new MyLocalClass;
            return new POA_DISPE_1_0_MyLocalClass_tie<MyLocalClass> (x);
        } // end if
    } // end incarnate

    void ClassManager::etherealize(const PortableServer::ObjectId &oid,
                                   PortableServer::POA_ptr poa,
                                   PortableServer::Servant serv,
                                   CORBA::Boolean cleanup_in_progress,
                                   CORBA::Boolean remaining_activations)
    {
        if (!remaining_activations) {
            delete serv;
        }
    }

    int main (int argc, char* argv[])
    {
        CORBA::ORB_var orb = CORBA::ORB_init (argc, argv, "mico-local-orb");
        CORBA::Object_var poaobj = orb->resolve_initial_references("RootPOA");
        PortableServer::POA_var poa = PortableServer::POA::_narrow(poaobj);
        PortableServer::POAManager_var mgr = poa->the_POAManager();
        CORBA::PolicyList pl;
        pl.length(1);
        pl[0] = poa->create_request_processing_policy(PortableServer::USE_SERVANT_MANAGER);
        PortableServer::POA_var mypoa = poa->create_POA("MyPOA", mgr, pl);
        ClassManager * cm = new ClassManager;
        PortableServer::ServantManager_var cmref = cm->_this();
        mypoa->set_servant_manager (cmref);
        DISPE_1_0_testlib_impl * cf = new DISPE_1_0_testlib_impl (mypoa);
        PortableServer::ObjectId_var oid = poa->activate_object(cf);
        mgr->activate();
        orb->run();
        poa->destroy(TRUE, TRUE);
        delete cm;
        delete cf;
        return 0;
    }

Although it may be good to generate some comments, an explanation will have to do. Since the original library already contains class definitions for MyLocalClass, only class implementations for DISPE_1_0_GLOBALS and DISPE_1_0_testlib need to be generated. DISPE_1_0_GLOBALS simply calls the original global function for the function MyGlobalParameterFunction, and it also generates two methods, MyGlobalVariable() and MyGlobalVariable(CORBA::Long), for getting and setting the global variable MyGlobalVariable, respectively. Notice that this variable's type has been converted to CORBA::Long for CORBA compatibility. The class DISPE_1_0_testlib_impl implements the factory mentioned earlier. It consists of two methods to create references for the two types of classes available in the library. These methods are called from the client library in the constructor of the corresponding classes in order to retrieve a new reference for each of these classes. The ClassManager is a servant activator which performs the actual activation, or instantiation, of the objects. So even though a new object reference may be created by the factory, the object is not instantiated until a method is called on that reference. The main method of the server performs the "standard" setup procedure to create the ORB and start the server.
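The link between the factory and the servant activator is the object identifier: the factory writes a four-digit counter followed by the class name, and incarnate skips the first four characters to recover the class name. The following is a minimal sketch of that convention in isolation; make_object_id and class_name_of are hypothetical helper names, and snprintf is used here in place of the listing's sprintf.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

const int kCounterDigits = 4;  // width of the "%04d" prefix in the listing

// Build an identifier such as "0012MyLocalClass".
std::string make_object_id(int counter, const std::string& class_name) {
    // A counter above 9999 would widen the prefix and break the decoding.
    assert(counter >= 0 && counter <= 9999);
    char prefix[8];
    std::snprintf(prefix, sizeof prefix, "%04d", counter);
    return std::string(prefix) + class_name;
}

// Recover the class name by skipping the fixed-width counter.
std::string class_name_of(const std::string& object_id) {
    assert(object_id.size() >= static_cast<std::string::size_type>(kCounterDigits));
    return object_id.substr(kCounterDigits);
}
```

The fixed width is what allows the activator to index directly to the fifth character, so the scheme as shown assumes that fewer than ten thousand references are handed out by one factory.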

The following code is tmp/testlib.h, which is simply a slightly modified version of the original testlib.h.

    #include "DISPE_1_0_testlib.h"
    #ifndef TESTLIB_H
    #define TESTLIB_H
    #include <...>

    extern int MyGlobalVariable;

    class MyLocalClass {
    public:
        DISPE_1_0_MyLocalClass_var my_MyLocalClass_node;
    public:
        // Constructor
        MyLocalClass();
        // Destructor
        ~MyLocalClass();
        void SetValue(int);
        int GetValue();
    private:
        int MyValue;
    };
    #endif /* TESTLIB_H */

The only change is that a public member my_MyLocalClass_node of type DISPE_1_0_MyLocalClass_var was inserted. This variable holds the reference to the remote object, and the remote methods are invoked through it when the class is called locally. This can be clarified by looking at the source for newLib.cc, as follows.

    #include "DISPE_1_0_testlib.h"
    #include "newtest.h"
    #include "testlib.h"

    CORBA::ORB_var global_orb = NULL;
    CORBA::Object_var global_obj = NULL;
    DISPE_1_0_GLOBALS_var globals_variable = NULL;
    DISPE_1_0_testlib_var factory_var = NULL;

    void getGlobalVariables();
    void setGlobalVariables();

    void doInit()
    {
        if (!CORBA::is_nil(global_orb)) return;
        int argc = 3;
        char* argv[] = {"MY_DISPE_1_0_PROG", "-ORBBINDAddr", "inet:geometry:9001"};
        global_orb = CORBA::ORB_init(argc, argv, "mico-local-orb");
        global_obj = global_orb->bind("IDL:DISPE_1_0_testlib:1.0", "inet:geometry:9001");
        if (CORBA::is_nil(global_obj)) {
            printf("problem binding to globals!");
            exit(1);
        }
        factory_var = DISPE_1_0_testlib::_narrow(global_obj);
        if (CORBA::is_nil(factory_var)) {
            printf("problem narrowing to testlib!");
            exit(1);
        }
        globals_variable = factory_var->create_globals();
        if (CORBA::is_nil(globals_variable)) {
            printf("problem narrowing to globals!");
            exit(1);
        }
        getGlobalVariables();
    }

    int MyGlobalParameterFunction(char* arg0)
    {
        doInit();
        setGlobalVariables();
        int myret;
        myret = globals_variable->DISPE_1_0_GLOBALS_MyGlobalParameterFunction(arg0);
        getGlobalVariables();
        return myret;
    }

    MyLocalClass::MyLocalClass()
    {
        doInit();
        my_MyLocalClass_node = factory_var->create_MyLocalClass();
        if (CORBA::is_nil(my_MyLocalClass_node)) {
            printf("problem narrowing to my_MyLocalClass_node!");
            exit(1);
        }
    }

    MyLocalClass::~MyLocalClass() {}

    void MyLocalClass::SetValue(int arg0)
    {
        my_MyLocalClass_node->SetValue(arg0);
    };

    int MyLocalClass::GetValue()
    {
        return my_MyLocalClass_node->GetValue();
    };

    int MyGlobalVariable;

    void getGlobalVariables()
    {
        MyGlobalVariable = globals_variable->DISPE_1_0_GLOBALS_MyGlobalVariable();
    }

    void setGlobalVariables()
    {
        globals_variable->DISPE_1_0_GLOBALS_MyGlobalVariable(MyGlobalVariable);
    }

As can be seen, the member functions of MyLocalClass simply make calls to my_MyLocalClass_node's corresponding member function. All of the initialization for CORBA is performed in doInit(), which is called either in a class's constructor or when a global function is called.
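The handling of global variables in the client library follows a push, call, pull pattern around every global function. The following is a minimal in-process sketch of that pattern, assuming a plain struct in place of the remote DISPE_1_0_GLOBALS object; FakeRemoteGlobals, remote_globals, LocalGlobal, do_init, and bump_global are all hypothetical names, and no ORB is involved.

```cpp
#include <cassert>

// Stand-in for the remote globals object (hypothetical, same process).
struct FakeRemoteGlobals {
    long variable;
    FakeRemoteGlobals() : variable(0) {}
    long get() const { return variable; }
    void set(long v) { variable = v; }
    // A "library" function with a side effect on the remote global.
    long bump(long by) { variable += by; return variable; }
};

FakeRemoteGlobals* remote_globals = 0;  // plays the role of globals_variable
int LocalGlobal = 0;                    // plays the role of MyGlobalVariable

// Lazy one-time setup, as in doInit().
void do_init() {
    if (remote_globals) return;              // already connected
    remote_globals = new FakeRemoteGlobals;  // real code would bind via the ORB
    LocalGlobal = static_cast<int>(remote_globals->get());  // initial pull
}

// Client-side wrapper for a global function.
int bump_global(int by) {
    do_init();
    remote_globals->set(LocalGlobal);                        // push local writes
    int result = static_cast<int>(remote_globals->bump(by)); // remote call
    LocalGlobal = static_cast<int>(remote_globals->get());   // pull side effects
    return result;
}
```

As in the generated listing, the initial pull inside the setup routine runs before the first push, so in this sketch an assignment made to the local copy before the very first wrapped call is overwritten by the remote value.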

6.6 Library Registration

Now, in order to make this new library available to the Compilation Agents, the Registration Tool should be used, which lists the groups of libraries and header files currently made available on the local system. Let us assume we have no libraries or include files registered yet. The Add button brings up a new window, from which we select the "New Group with sparc solaris" option. The File/Dir button then brings up a Java File Dialog box like Figure 30. The user selects the CORBAized library just generated and clicks the OK button. The Library radio button should be selected and a version number entered. After pressing the Register button and replying Yes to the "Are you sure?" question, we get a listing like Figure 31. The new info now looks like Figure 32.


Figure 30: File Dialog

The generated include files that belong to this library are also registered, by first selecting the new group added in the previous step and clicking the Add button again. After pressing the File/Dir button, this time we select the directory where the include files are located. After clicking OK, the Include radio button should be selected, and the registration is completed by clicking the Register button. The final listing looks like Figure 33.

Figure 31: New Library in Group 1


Figure 32: New Library Info

Figure 33: New Include Directory Entry

This tool generates the appropriate files in the format recognized by the compilation agents. It also adds the server code to the Implementation Repository, so that when a compiled program is executed, the server starts automatically and the executable can run the methods in the original library. The next time the user starts a Context on the host, the CORBAized library and header files are read by the Include and Library Proxy Agents and used in the compilation. When the user then runs an executable that was compiled with these libraries, the program automatically accesses the methods of the original library, even if it is running on a remote host.

These examples show how any library can be converted to a "CORBAized" library and how any library can be added to DISPE.

Chapter 7 Conclusion

7.1 Summary

DISPE has the potential to pave the way for a new style of programming. The system builds on established technologies such as Java and CORBA, which gives it a solid foundation. The use of Java-based agents is an innovative concept that can grow and remain useful for years to come. There are no signs of the C/C++ language weakening in the future, despite the growing use of Java. Thus, the integration of two of the most popular languages today in one distributed system should be very beneficial for C/C++ programmers. Remote compilation, whether "Traditional" or "Agent Compilation," frees users from worrying about the setup of libraries, include files, and their directories, allowing them more time to focus on programming. Remote execution is location transparent, so groups of users at remote sites can easily demonstrate geometric algorithms to one another, and the CORBAizer is especially useful in allowing remote libraries to be accessed during a program's execution. The CORBAizer should be very practical for users of existing legacy libraries written in C/C++. It is not easy to develop CORBA libraries manually from scratch, let alone to port a legacy library to CORBA. Thus, the CORBAizer should be useful to programmers in any field where legacy systems need to be integrated with CORBA. Finally, remote debugging capabilities, a necessity for most programmers, complete the DISPE package. To our knowledge, no other system provides such a complete programming framework.

7.2 Future Work

7.2.1 GeoJAVASheet

Because GeoJAVASheet was originally written with JDK version 1.0, it currently has no support for 3D objects. In the future, we would like to use Java's new 3D API to implement a 3D version of GeoJAVASheet that can visualize geometric algorithms in 3D.

7.2.2 Agent Compilers

In determining how to implement the search for hosts and libraries (the "search" for symbols is performed once an agent reaches a remote host and accesses a library), the Lightweight Directory Access Protocol (LDAP) [60, 59] at first seemed very attractive. Its attractiveness was compounded by Sun Microsystems' announcement of the Java Naming and Directory Interface (JNDI) [57], which supports LDAP. Therefore, as more hosts and libraries are added to the system, LDAP may become useful in implementing the Directories used in the system.

Due to the addition of CORBA to the DISPE system, we may also consider using a different agent system to implement the compiler agents. Although the IBM Aglets runtime environment is derived from the OMG standard Mobile Agent System Interoperability Facility (MASIF) [31], it is not a true CORBA interface. Thus, other agent systems such as Voyager may be considered as replacements for the Aglets in DISPE, since they inherently provide services such as directory support (in which case we would not even need to use LDAP).

As in any distributed system, the issue of security needs to be addressed, especially in the compilation and execution of programs that use foreign libraries. Both the code being compiled and the libraries registered in the system need to be checked for potentially dangerous code such as system calls. To approach this issue, agents could apply more intelligence when compiling the user's code, for example by detecting potentially dangerous methods and asking the user whether the detected code should be compiled into the program. A similar procedure could be followed during the registration and/or CORBAization of libraries.

7.2.3 The CORBAizer

The CORBAizer as implemented generates server and client code that is compatible with MICO's CORBA implementation. Future work could extend it to generate code for other CORBA implementations as well.

7.2.4 Debugger

As mentioned earlier, the current remote debugger is simply a debugger located on the web server for use by a single user at a time. Multiple instances of the debugger can run simultaneously, but there is currently no functionality that allows multiple users to access the same debugger at the same time, as there is for the visualization tool. This kind of functionality may be useful in the future, for example when a group of users is working on a project. Although this feature is not urgent, it may be an interesting project to pursue.

7.2.5 Registration Tool

The Registration Tool currently reads three sets of files to determine the libraries and include files available to the compilation agents. However, when large sets of libraries and include files reside on a system, it may be more efficient to use a database. Therefore, the file-based storage behind the Registration Tool may be replaced by a database in the future.
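To make the idea of keyed lookup concrete, the sketch below groups registered entries by platform name so that the libraries of one group can be retrieved without scanning every record. It is only an in-memory illustration under stated assumptions: the type and field names (RegistryEntry, Registry, path, isLibrary, version) and the example paths are hypothetical and do not reflect the file format actually used by the Registration Tool.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// One registered item; field names are illustrative only.
struct RegistryEntry {
    std::string path;     // library file or include directory
    bool isLibrary;       // true for a library, false for an include directory
    std::string version;  // version entered at registration time
};

// Groups entries by platform name, e.g. "sparc solaris".
class Registry {
public:
    void add(const std::string& group, const RegistryEntry& entry) {
        groups_[group].push_back(entry);
    }

    // Returns the library paths registered for one group (empty if unknown).
    std::vector<std::string> libraries(const std::string& group) const {
        std::vector<std::string> result;
        std::map<std::string, std::vector<RegistryEntry> >::const_iterator it =
            groups_.find(group);
        if (it == groups_.end()) return result;
        for (std::size_t i = 0; i < it->second.size(); ++i) {
            if (it->second[i].isLibrary) result.push_back(it->second[i].path);
        }
        return result;
    }

private:
    std::map<std::string, std::vector<RegistryEntry> > groups_;
};
```

A database table indexed on the group name would serve the same purpose while also persisting the entries across sessions.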

7.3 Concluding Remarks

At the start of this project, as with almost any project, the idea was simpler than the reality. The limitations of the various software currently available forced us to devise workarounds for many portions of the project that could have been straightforward if the software were "bug-free." However, we were still able to complete a working prototype of a distributed programming environment that covers almost all aspects of programming, including a distributed compiler, a remote debugger, and even a visualization system, all accessible via the World Wide Web. CORBA provided unexpected assistance in the implementation of this system with its naturally distributed features.

The authors' lack of experience in CORBA may have been the biggest hurdle to overcome in implementing the CORBAizer. Deciding which CORBA implementation to use and then reading through the 962-page CORBA specification was no easy task. In addition, various CORBA vendors provide slightly different APIs, depending on the degree of detail given in the specification. However, with the assistance of CORBA experts from newsgroups and mailing lists, we were able to develop a working CORBAizer, which should prove to be quite useful in many areas where users wish to integrate legacy systems into a CORBA environment.

Bibliography

[1] D. Abramson, I. Foster, J. Michalakes, R. Sosic, Relative Debugging: A new paradigm for debugging scientific applications, Communications of the ACM, Vol. 39, No. 11, pp. 67-77, Nov. 1996.
[2] S.P. Amarasinghe, J.M. Anderson, M.S. Lam, C.W. Tseng, The SUIF Compiler for Scalable Parallel Machines, Proceedings of the Seventh SIAM Conference on Parallel Processing for Scientific Computing, February, 1995.
[3] K. Aoki, "The Prototyping of GeoManager: A Geometric Algorithm Manipulation System", Master's Thesis, Dept. EE/CS, Northwestern University, December, 1995.
[4] K.F. Aoki, D.T. Lee, Towards Web-Based Computing, Department of Electrical and Computer Engineering, Technical Report, Northwestern University, January, 1998.
[5] J.E. Baker, I.F. Cruz, L.D. Lejter, G. Liotta, and R. Tamassia, "Mocha", http://loki.cs.brown.edu:8080/pages/MochaFS.html.
[6] L. Beca, G. Cheng, G.C. Fox, T. Jurga, K. Olszewski, M. Podgorny, P. Sokolowski, K. Walczak, Web Technologies for Collaborative Visualization and Simulation, NPAC Technical Report SCCS-786, Syracuse University, NPAC, Syracuse, NY, submitted January 6, 1997.
[7] M.H. Brown, M.A. Najork, R. Raisamo, A Java-Based Implementation of Collaborative Active Textbooks, in 1997 IEEE Symposium on Visual Languages, pages 372-379. IEEE Computer Society, September 1997.
[8] M.H. Brown, R. Sedgewick, A System for Algorithm Animation, Computer Graphics, 18(3), 177-186, July 1984.
[9] P.A. Buhr, M. Karsten, J. Shih, A Multi-threaded Debugger for Multi-threaded Applications. In Proceedings of SPDT '96: SIGMETRICS Symposium on Parallel and Distributed Tools, pp. 80-87, Philadelphia, Pennsylvania, May, 1996.
[10] P.A. Buhr, R.A. Stroobosscher, μC++ Annotated Reference Manual, Version 4.6. Technical report, Department of Computer Science, University of Waterloo, Waterloo, Ontario, Canada, N2L 3G1, July 1996.
[11] S. Chamberlain, libbfd - The Binary File Descriptor Library, First Edition, Cygnus Support, April 1991.
[12] L. Daigle, S. Newell, Intelligent Agents and the Internet Information Infrastructure, paper presented at the Inet '96 conference, Montreal, Canada, June, 1996.
[13] O. Etzioni, D. Weld, A Softbot-Based Interface to the Internet, Communications of the ACM, July, 1994.
[14] U. Gall, F.J. Hauck, Promondia: A Java-Based Framework for Real-Time Group Communication in the Web, Sixth International World Wide Web Conference, 1996.
[15] R. Gordon, Essential JNI: Java Native Interface, Prentice-Hall, Inc., 1998.
[16] S. Grabner, J. Volkert, A Debugger for Distributed Memory Machines, Proc. Workshop on Parallel Processing, Lessach, Austria, September, 1993.
[17] M. Henning, S. Vinoski, Advanced CORBA Programming with C++, Addison-Wesley Longman, Inc., 1999.
[18] HyperParallel Technologies, Palaiseau, France, HyperC Documentation Kit, 1993.
[19] J. Kempf, P.B. Kessler, Cross-Address Space Dynamic Linking, Sun Microsystems Laboratories, Inc., Technical Report SMLI TR-92-2, September, 1992.
[20] J. Kiniry, D. Zimmerman, A Hands-On Look at Java Mobile Agents, IEEE Internet Computing, July-August, 1997, 21-30.
[21] D.B. Lange, M. Oshima, Programming and Deploying Java Mobile Agents with Aglets, Addison-Wesley Longman, Inc., 1998.
[22] D.T. Lee, C.F. Shen, S.M. Sheu, "GeoSheet: A Distributed Visualization Tool for Geometric Algorithms", Int'l J. Computational Geometry & Applications, 8,2, April 1998, pp. 119-155.
[23] P. Maes, Intelligent Software, Scientific American, Vol. 273, No. 3, pp. 84-86, September, 1995.
[24] K. Mehlhorn, S. Näher, "LEDA: a platform for combinatorial and geometric computing", Communications of the ACM, 38,1 (Jan. 1995), pp. 96-102.
[25] K. Mehlhorn, S. Näher, M. Seel, C. Uhrig, "The LEDA User Manual - Version 3.7", Max-Planck-Institut für Informatik, 66123 Saarbrücken, Germany. http://www.mpi-sb.mpg.de/LEDA/MANUAL/MANUAL.html.
[26] D. Milojicic, W. LaForge, D. Chauhan, Mobile Objects and Agents (MOA), Design Implementation and Lessons Learned, The 4th USENIX Conference on Object-Oriented Technologies and Systems (COOTS), Santa Fe, New Mexico, April, 1998.
[27] S. Näher, "LEDA - Library of Efficient Data Types and Algorithms", Max-Planck-Institut für Informatik. Technical Report A 04/89, Universität des Saarlandes, Saarbrücken, 1992. http://www.mpi-sb.mpg.de/LEDA/leda.html.
[28] S. Näher, "The LEDA 3.0 User Manual", Technischer Bericht A, Fachbereich Informatik, Universität des Saarlandes, Saarbrücken, 1992.
[29] Object Management Group. The Common Object Request Broker: Architecture and Specification, Revision 2.2, OMG Technical Document 98-07-01.
[30] Object Management Group. 1995. ORB Portability Enhancement RFP. Framingham, MA: Object Management Group. ftp://ftp.omg.org/pub/docs/1995/95-06-26.pdf.
[31] Object Management Group. 1998. Mobile Agent System Interoperability Facility - OMG Revision Task Force Document. Framingham, MA: Object Management Group. ftp://ftp.omg.org/pub/docs/orbos/98-03-09.pdf.
[32] M.H. Overmars, "Designing the Computational Geometry Algorithms Library CGAL." In Proc. Workshop on Applied Computational Geometry, May 27-28, 1996, Philadelphia, Pennsylvania, pp. 113-119.
[33] Solaris Linker and Libraries Guide, Sun Microsystems, Inc., 1997. http://docs.sun.com:80/ab2/coll.45.4/LLM/@Ab2TocView?
[34] B.V. Smith, The Xfig User Manual, 1993.
[35] R.M. Stallman, R.H. Pesch, Debugging with GDB. Free Software Foundation, 675 Massachusetts Avenue, Cambridge, MA, 02139, 1995.
[36] R.M. Stallman, Using and Porting GNU CC, Version 2.8.1. Free Software Foundation, 59 Temple Place - Suite 330, Boston, MA, 02111-1307, 1998.
[37] B. Stroustrup, The C++ Programming Language, Third Edition, Addison-Wesley Longman, Inc., 1997.
[38] C.M. Wang, C.C. Hsu, Y.S. Kuo, Class Exerciser: a Basic Tool for Object-Oriented Development, Proceedings of the 1995 Asia-Pacific Software Engineering Conference, pp. 108-116, Brisbane, Australia, December 1995.
[39] T.R. Chuang, Y.S. Kuo, C.M. Wang, Non-Intrusive Object Introspection in C++ - Architecture and Application, Proceedings of the 20th International Conference on Software Engineering, pp. 312-321, Kyoto, Japan, April 1998.
[40] D. Wong, N. Paciorek, T. Walsh, J. DiCelie, M. Young, B. Peet, Concordia: An Infrastructure for Collaborating Mobile Agents, In Proceedings Mobile Agents, First International Workshop, April, 1997, Berlin, Germany, pp. 86-97.
[41] ACE. http://www.cs.wustl.edu/~schmidt/ACE.html.
[42] ADAPTIVE. http://www.cs.wustl.edu/~schmidt/adaptive.html.
[43] The Agent Society. http://www.agent.org/.
[44] The Alpha-shape Demo. http://www.alphashapes.org/alpha/demo.html.
[45] Autonomy Knowledge Management Products. http://www.agentware.com/knowledge/index.html.
[46] CORBA. http://www.omg.org/corba/whatiscorba.html.
[47] CorbaScript. http://corbaweb.lifl.fr/CorbaScript/index.html.
[48] CorbaWeb. http://corbaweb.lifl.fr/index.html.
[49] Cetus Links - CORBA. http://www.cetus-links.org/oo_corba.html#oo_corba_projects.
[50] DOMIS. http://www.mitre.org/research/domis/index.html.
[51] GeoMAMOS. http://www.ece.nwu.edu/~theory/geomamos.html.
[52] The Geometry Applet. http://aleph0.clarku.edu/~djoyce/java/Geometry/Geometry.html.
[53] GeomNet. http://www.cgc.cs.jhu.edu/geomNet/.
[54] GOODE. http://corbaweb.lifl.fr/GOODE/index.html.
[55] Habanero. http://havefun.ncsa.uiuc.edu/Habanero/.
[56] IONA Orbix. http://www.iona.com/products/orbixchoice.html.
[57] Java Naming and Directory Interface (JNDI). http://java.sun.com/products/jndi/.
[58] KYMA. http://www.info.unicaen.fr/~serge/kyma.html.
[59] An LDAP Roadmap & FAQ. http://www.kingsmountain.com/ldapRoadmap.shtml.
[60] Lightweight Directory Access Protocol (LDAP) FAQ. http://www.critical-angle.com/ldapworld/ldapfaq.html.
[61] LiveAgent. http://www.agentsoft.com/liveagent.html.
[62] MICO. http://www.mico.org/.
[63] ModeMap. http://www.iinet.com.au/~watson/modemap.html.
[64] ORBacus. http://www.ooc.com/ob/.
[65] Cetus Links - CORBA - ORBs. http://www.cetus-links.org/oo_object_request_brokers.html.
[66] RFC 2109 - HTTP State Management Mechanism. http://www.cis.ohio-state.edu/htbin/rfc/rfc2109.html.
[67] Software Agents Group. http://agents.www.media.mit.edu/groups/agents/.
[68] TAO. http://www.cs.wustl.edu/~schmidt/TAO.html.
[69] TCL Plugin (Tclet). http://www.scriptics.com/plugin/.
[70] UNIX FAQ. http://www.cis.ohio-state.edu/hypertext/faq/usenet/unix-faq/faq/top.html.
[71] VoroGlide. http://wwwpi6.fernuni-hagen.de/java/anja/index.html.en.
[72] Voyager. http://www.objectspace.com/voyager.
[73] Inprise VisiBroker. http://www.inprise.com/visibroker/.

Appendix A libbfd

This appendix lists the code for the three main data structures as declared in the original libbfd library. It is intended to give the reader an idea of how BFDs are structured to keep track of object files and archives, and to illustrate how the data structures relate to one another.

A.1 bfd

The bfd structure looks like the following:

struct _bfd
{
  /* The filename the application opened the BFD with.  */
  CONST char *filename;

  /* A pointer to the target jump table.  */
  const struct bfd_target *xvec;

  /* To avoid dragging too many header files into every file that
     includes `<bfd.h>', IOSTREAM has been declared as a "char *",
     and MTIME as a "long".  Their correct types, to which they
     are cast when used, are "FILE *" and "time_t".  The iostream
     is the result of an fopen on the filename.  However, if the
     BFD_IN_MEMORY flag is set, then iostream is actually a pointer
     to a bfd_in_memory struct.  */
  PTR iostream;

  /* Is the file descriptor being cached?  That is, can it be closed as
     needed, and re-opened when accessed later?  */
  boolean cacheable;

  /* Marks whether there was a default target specified when the
     BFD was opened.  This is used to select which matching algorithm
     to use to choose the back end.  */
  boolean target_defaulted;

  /* The caching routines use these to maintain a
     least-recently-used list of BFDs.  */
  struct _bfd *lru_prev, *lru_next;

  /* When a file is closed by the caching routines, BFD retains
     state information on the file here:  */
  file_ptr where;

  /* and here: (``once'' means at least once)  */
  boolean opened_once;

  /* Set if we have a locally maintained mtime value, rather than
     getting it from the file each time:  */
  boolean mtime_set;

  /* File modified time, if mtime_set is true:  */
  long mtime;

  /* Reserved for an unimplemented file locking extension.  */
  int ifd;

  /* The format which belongs to the BFD.  (object, core, etc.)  */
  bfd_format format;

  /* The direction the BFD was opened with.  */
  enum bfd_direction {no_direction = 0,
                      read_direction = 1,
                      write_direction = 2,
                      both_direction = 3} direction;

  /* Format_specific flags.  */
  flagword flags;

  /* Currently my_archive is tested before adding origin to
     anything.  I believe that this can become always an add of
     origin, with origin set to 0 for non archive files.  */
  file_ptr origin;

  /* Remember when output has begun, to stop strange things
     from happening.  */
  boolean output_has_begun;

  /* Pointer to linked list of sections.  */
  struct sec *sections;

  /* The number of sections.  */
  unsigned int section_count;

  /* Stuff only useful for object files:
     The start address.  */
  bfd_vma start_address;

  /* Used for input and output.  */
  unsigned int symcount;

  /* Symbol table for output BFD (with symcount entries).  */
  struct symbol_cache_entry **outsymbols;

  /* Pointer to structure which contains architecture information.  */
  const struct bfd_arch_info *arch_info;

  /* Stuff only useful for archives:  */
  PTR arelt_data;
  struct _bfd *my_archive;     /* The containing archive BFD.  */
  struct _bfd *next;           /* The next BFD in the archive.  */
  struct _bfd *archive_head;   /* The first BFD in the archive.  */
  boolean has_armap;

  /* A chain of BFD structures involved in a link.  */
  struct _bfd *link_next;

  /* A field used by _bfd_generic_link_add_archive_symbols.  This will
     be used only for archive elements.  */
  int archive_pass;

  /* Used by the back end to hold private data.  */
  union
  {
    struct aout_data_struct *aout_data;
    struct artdata *aout_ar_data;
    struct _oasys_data *oasys_obj_data;
    struct _oasys_ar_data *oasys_ar_data;
    struct coff_tdata *coff_obj_data;
    struct pe_tdata *pe_obj_data;
    struct xcoff_tdata *xcoff_obj_data;
    struct ecoff_tdata *ecoff_obj_data;
    struct ieee_data_struct *ieee_data;
    struct ieee_ar_data_struct *ieee_ar_data;
    struct srec_data_struct *srec_data;
    struct ihex_data_struct *ihex_data;
    struct tekhex_data_struct *tekhex_data;
    struct elf_obj_tdata *elf_obj_data;
    struct nlm_obj_tdata *nlm_obj_data;
    struct bout_data_struct *bout_data;
    struct sun_core_struct *sun_core_data;
    struct trad_core_struct *trad_core_data;
    struct som_data_struct *som_data;
    struct hpux_core_struct *hpux_core_data;
    struct hppabsd_core_struct *hppabsd_core_data;
    struct sgi_core_struct *sgi_core_data;
    struct lynx_core_struct *lynx_core_data;
    struct osf_core_struct *osf_core_data;
    struct cisco_core_struct *cisco_core_data;
    struct versados_data_struct *versados_data;
    struct netbsd_core_struct *netbsd_core_data;
    PTR any;
  } tdata;

  /* Used by the application to hold private data.  */
  PTR usrdata;

  /* Where all the allocated stuff under this BFD goes.  This is a
     struct objalloc *, but we use PTR to avoid requiring the inclusion
     of objalloc.h.  */
  PTR memory;
};
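The sections field of a bfd is the head of a singly linked list that is chained through the next field of each section (see Section A.2), and section_count records its length. The sketch below shows that traversal on drastically reduced stand-in types. MiniBfd, MiniSection, and sectionNames are hypothetical names introduced here for illustration; they keep only the fields needed for the walk and are not the libbfd API.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Reduced stand-in for asection: only the name and the list link.
struct MiniSection {
    const char* name;
    MiniSection* next;  // next section in the same BFD, or NULL
};

// Reduced stand-in for struct _bfd: only the section list and its length.
struct MiniBfd {
    const char* filename;
    MiniSection* sections;       // head of the linked list of sections
    unsigned int section_count;  // number of sections in the list
};

// Walks the section list from head to tail and collects the names.
std::vector<std::string> sectionNames(const MiniBfd& abfd) {
    std::vector<std::string> names;
    for (const MiniSection* s = abfd.sections; s != NULL; s = s->next) {
        names.push_back(s->name);
    }
    // The list length and the stored count should always agree.
    assert(names.size() == abfd.section_count);
    return names;
}
```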

A.2 asection

This is the asection data structure.

typedef struct sec
{
  /* The name of the section; the name isn't a copy, the pointer is
     the same as that passed to bfd_make_section.  */
  CONST char *name;

  /* Which section is it; 0..nth.  */
  int index;

  /* The next section in the list belonging to the BFD, or NULL.  */
  struct sec *next;

  /* The field flags contains attributes of the section.  Some
     flags are read in from the object file, and some are
     synthesized from other information.  */
  flagword flags;

#define SEC_NO_FLAGS   0x000

  /* Tells the OS to allocate space for this section when loading.
     This is clear for a section containing debug information only.  */
#define SEC_ALLOC      0x001

  /* Tells the OS to load the section from the file when loading.
     This is clear for a .bss section.  */
#define SEC_LOAD       0x002

  /* The section contains data still to be relocated, so there is
     some relocation information too.  */
#define SEC_RELOC      0x004

#if 0   /* Obsolete ? */
#define SEC_BALIGN     0x008
#endif

  /* A signal to the OS that the section contains read only data.  */
#define SEC_READONLY   0x010

  /* The section contains code only.  */
#define SEC_CODE       0x020

  /* The section contains data only.  */
#define SEC_DATA       0x040

  /* The section will reside in ROM.  */
#define SEC_ROM        0x080

  /* The section contains constructor information.  This section
     type is used by the linker to create lists of constructors and
     destructors used by g++.  When a back end sees a symbol
     which should be used in a constructor list, it creates a new
     section for the type of name (e.g., __CTOR_LIST__), attaches
     the symbol to it, and builds a relocation.  To build the lists
     of constructors, all the linker has to do is catenate all the
     sections called __CTOR_LIST__ and relocate the data
     contained within - exactly the operations it would perform on
     standard data.  */
#define SEC_CONSTRUCTOR 0x100

  /* The section is a constructor, and should be placed at the
     end of the text, data, or bss section(?).  */
#define SEC_CONSTRUCTOR_TEXT 0x1100
#define SEC_CONSTRUCTOR_DATA 0x2100
#define SEC_CONSTRUCTOR_BSS  0x3100

  /* The section has contents - a data section could be
     SEC_ALLOC | SEC_HAS_CONTENTS; a debug section could be
     SEC_HAS_CONTENTS.  */
#define SEC_HAS_CONTENTS 0x200

  /* An instruction to the linker to not output the section
     even if it has information which would normally be written.  */
#define SEC_NEVER_LOAD 0x400

  /* The section is a COFF shared library section.  This flag is
     only for the linker.  If this type of section appears in
     the input file, the linker must copy it to the output file
     without changing the vma or size.  FIXME: Although this
     was originally intended to be general, it really is COFF
     specific (and the flag was renamed to indicate this).  It
     might be cleaner to have some more general mechanism to
     allow the back end to control what the linker does with
     sections.  */
#define SEC_COFF_SHARED_LIBRARY 0x800

  /* The section contains common symbols (symbols may be defined
     multiple times, the value of a symbol is the amount of
     space it requires, and the largest symbol value is the one
     used).  Most targets have exactly one of these (which we
     translate to bfd_com_section_ptr), but ECOFF has two.  */
#define SEC_IS_COMMON 0x8000

  /* The section contains only debugging information.  For
     example, this is set for ELF .debug and .stab sections.
     strip tests this flag to see if a section can be
     discarded.  */
#define SEC_DEBUGGING 0x10000

  /* The contents of this section are held in memory pointed to
     by the contents field.  This is checked by
     bfd_get_section_contents, and the data is retrieved from
     memory if appropriate.  */
#define SEC_IN_MEMORY 0x20000

  /* The contents of this section are to be excluded by the
     linker for executable and shared objects unless those
     objects are to be further relocated.  */
#define SEC_EXCLUDE 0x40000

  /* The contents of this section are to be sorted based on
     the address specified in the associated symbol table.  */
#define SEC_SORT_ENTRIES 0x80000

  /* When linking, duplicate sections of the same name should be
     discarded, rather than being combined into a single section as
     is usually done.  This is similar to how common symbols are
     handled.  See SEC_LINK_DUPLICATES below.  */
#define SEC_LINK_ONCE 0x100000

  /* If SEC_LINK_ONCE is set, this bitfield describes how the linker
     should handle duplicate sections.  */
#define SEC_LINK_DUPLICATES 0x600000

  /* This value for SEC_LINK_DUPLICATES means that duplicate
     sections with the same name should simply be discarded.  */
#define SEC_LINK_DUPLICATES_DISCARD 0x0

  /* This value for SEC_LINK_DUPLICATES means that the linker
     should warn if there are any duplicate sections, although
     it should still only link one copy.  */
#define SEC_LINK_DUPLICATES_ONE_ONLY 0x200000

  /* This value for SEC_LINK_DUPLICATES means that the linker
     should warn if any duplicate sections are a different size.  */
#define SEC_LINK_DUPLICATES_SAME_SIZE 0x400000

  /* This value for SEC_LINK_DUPLICATES means that the linker
     should warn if any duplicate sections contain different
     contents.  */
#define SEC_LINK_DUPLICATES_SAME_CONTENTS 0x600000

  /* This section was created by the linker as part of dynamic
     relocation or other arcane processing.  It is skipped when
     going through the first-pass output, trusting that someone
     else up the line will take care of it later.  */
#define SEC_LINKER_CREATED 0x800000

  /* End of section flags.  */

  /* Some internal packed boolean fields.  */

  /* See the vma field.  */
  unsigned int user_set_vma : 1;

  /* Whether relocations have been processed.  */
  unsigned int reloc_done : 1;

  /* A mark flag used by some of the linker backends.  */
  unsigned int linker_mark : 1;

  /* End of internal packed boolean fields.  */

  /* The virtual memory address of the section - where it will be
     at run time.  The symbols are relocated against this.  The
     user_set_vma flag is maintained by bfd; if it's not set, the
     backend can assign addresses (for example, in a.out, where
     the default address for .data is dependent on the specific
     target and various flags).  */
  bfd_vma vma;

  /* The load address of the section - where it would be in a
     rom image; really only used for writing section header
     information.  */
  bfd_vma lma;

  /* The size of the section in bytes, as it will be output.
     Contains a value even if the section has no contents (e.g., the
     size of .bss).  This will be filled in after relocation.  */
  bfd_size_type _cooked_size;

  /* The original size on disk of the section, in bytes.  Normally this
     value is the same as the size, but if some relaxing has
     been done, then this value will be bigger.  */
  bfd_size_type _raw_size;

  /* If this section is going to be output, then this value is the
     offset into the output section of the first byte in the input
     section.  E.g., if this was going to start at the 100th byte in
     the output section, this value would be 100.  */
  bfd_vma output_offset;

  /* The output section through which to map on output.  */
  struct sec *output_section;

  /* The alignment requirement of the section, as an exponent of 2 -
     e.g., 3 aligns to 2^3 (or 8).  */
  unsigned int alignment_power;

  /* If an input section, a pointer to a vector of relocation
     records for the data in this section.  */
  struct reloc_cache_entry *relocation;

  /* If an output section, a pointer to a vector of pointers to
     relocation records for the data in this section.  */
  struct reloc_cache_entry **orelocation;

  /* The number of relocation records in one of the above.  */
  unsigned reloc_count;

  /* Information below is back end specific - and not always used
     or updated.  */

  /* File position of section data.  */
  file_ptr filepos;

  /* File position of relocation info.  */
  file_ptr rel_filepos;

  /* File position of line data.  */
  file_ptr line_filepos;

  /* Pointer to data for applications.  */
  PTR userdata;

  /* If the SEC_IN_MEMORY flag is set, this points to the actual
     contents.  */
  unsigned char *contents;

  /* Attached line number information.  */
  alent *lineno;

  /* Number of line number records.  */
  unsigned int lineno_count;

  /* When a section is being output, this value changes as more
     linenumbers are written out.  */
  file_ptr moving_line_filepos;

  /* What the section number is in the target world.  */
  int target_index;

  PTR used_by_bfd;

  /* If this is a constructor section then here is a list of the
     relocations created to relocate items within it.  */
  struct relent_chain *constructor_chain;

  /* The BFD which owns the section.  */
  bfd *owner;

  /* A symbol which points at this section only.  */
  struct symbol_cache_entry *symbol;
  struct symbol_cache_entry **symbol_ptr_ptr;

  struct bfd_link_order *link_order_head;
  struct bfd_link_order *link_order_tail;
} asection;
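The flags member packs independent one-bit attributes together with a two-bit field (SEC_LINK_DUPLICATES), and alignment_power stores the alignment as an exponent. The following sketch shows how such values can be tested and decoded. The numeric constants are copied from the listing above, while the constant names, the flagword_t typedef, and the helper functions are assumptions made for this illustration rather than libbfd definitions.

```cpp
#include <cassert>

typedef unsigned int flagword_t;  // stand-in for BFD's flagword

// Values copied from the asection listing.
const flagword_t kSecAlloc = 0x001;
const flagword_t kSecLoad = 0x002;
const flagword_t kSecHasContents = 0x200;
const flagword_t kSecLinkOnce = 0x100000;
const flagword_t kSecLinkDuplicates = 0x600000;  // mask of the two-bit field
const flagword_t kSecLinkDuplicatesSameSize = 0x400000;

// True only if every bit in `wanted` is set in `flags`.
bool hasAll(flagword_t flags, flagword_t wanted) {
    return (flags & wanted) == wanted;
}

// Extracts the duplicate-handling policy; meaningful only for link-once sections.
flagword_t duplicatePolicy(flagword_t flags) {
    assert((flags & kSecLinkOnce) != 0);
    return flags & kSecLinkDuplicates;
}

// Converts alignment_power into a byte count, e.g. 3 -> 8.
unsigned long alignmentBytes(unsigned int alignment_power) {
    assert(alignment_power < 8 * sizeof(unsigned long));
    return 1UL << alignment_power;
}
```

Note that the policy must be compared against the full field value after masking, because SEC_LINK_DUPLICATES_SAME_CONTENTS has both bits of the field set.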

A.3

asymbol

And this is asymbol. typedef struct symbol_cache_entry { /* A pointer to the BFD which owns the symbol. This information is necessary so that a back end can work out what additional information (invisible to the application writer) is carried with the symbol. This field is *almost* redundant, since you can use section->owner instead, except that some symbols point to the global sections bfd_{abs,com,und}_section. This could be fixed by making these globals be per-bfd (or per-target-flavor). FIXME. */ struct _bfd *the_bfd;

/* Use bfd_asymbol_bfd(sym) to access this field. */

/* The text of the symbol. The name is left alone, and not copied; the application may not alter it. */ CONST char *name; /* The value of the symbol.

This really should be a union of a

115 numeric value with a pointer, since some flags indicate that a pointer to another symbol is stored here. */ symvalue value; /* Attributes of a symbol: */ #define BSF_NO_FLAGS

0x00

  /* The symbol has local scope; in . The value is the offset
     into the section of the data. */
#define BSF_LOCAL 0x01

  /* The symbol has global scope; initialized data in . The value
     is the offset into the section of the data. */
#define BSF_GLOBAL 0x02

  /* The symbol has global scope and is exported. The value is
     the offset into the section of the data. */
#define BSF_EXPORT BSF_GLOBAL /* no real difference */

  /* A normal C symbol would be one of: , , or */

  /* The symbol is a debugging record. The value has an arbitrary
     meaning. */
#define BSF_DEBUGGING 0x08

  /* The symbol denotes a function entry point. Used in ELF,
     perhaps others someday. */
#define BSF_FUNCTION 0x10

  /* Used by the linker. */
#define BSF_KEEP 0x20
#define BSF_KEEP_G 0x40

  /* A weak global symbol, overridable without warnings by a
     regular global symbol of the same name. */
#define BSF_WEAK 0x80

  /* This symbol was created to point to a section, e.g. ELF's
     STT_SECTION symbols. */
#define BSF_SECTION_SYM 0x100

  /* The symbol used to be a common symbol, but now it is
     allocated. */
#define BSF_OLD_COMMON 0x200

  /* The default value for common data. */
#define BFD_FORT_COMM_DEFAULT_VALUE 0

  /* In some files the type of a symbol sometimes alters its
     location in an output file - ie in coff a symbol which is
     also symbol appears where it was declared and not at the end
     of a section. This bit is set by the target BFD part to
     convey this information. */
#define BSF_NOT_AT_END 0x400

  /* Signal that the symbol is the label of constructor section. */
#define BSF_CONSTRUCTOR 0x800

  /* Signal that the symbol is a warning symbol. The name is a
     warning. The name of the next symbol is the one to warn about;
     if a reference is made to a symbol with the same name as the
     next symbol, a warning is issued by the linker. */
#define BSF_WARNING 0x1000

  /* Signal that the symbol is indirect. This symbol is an indirect
     pointer to the symbol with the same name as the next symbol. */
#define BSF_INDIRECT 0x2000

  /* BSF_FILE marks symbols that contain a file name. This is used
     for ELF STT_FILE symbols. */
#define BSF_FILE 0x4000

  /* Symbol is from dynamic linking information. */
#define BSF_DYNAMIC 0x8000

  /* The symbol denotes a data object. Used in ELF, and perhaps
     others someday. */
#define BSF_OBJECT 0x10000

  flagword flags;

  /* A pointer to the section to which this symbol is relative.
     This will always be non NULL, there are special sections for
     undefined and absolute symbols. */
  struct sec *section;

  /* Back end special data. */
  union
    {
      PTR p;
      bfd_vma i;
    } udata;

} asymbol;
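Because each flag in the listing above occupies its own bit, several flags can be combined in one flagword and tested with a bitwise AND. The following is a minimal sketch of that idea, not code from BFD: it assumes flagword can be modelled as a 32-bit unsigned integer, copies only a few of the flag values, and the helpers has_flag and describe are hypothetical names introduced for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Assumption: flagword is modelled here as a 32-bit unsigned integer.
using flagword = std::uint32_t;

// A few flag values copied from the listing above.
constexpr flagword BSF_LOCAL    = 0x01;
constexpr flagword BSF_GLOBAL   = 0x02;
constexpr flagword BSF_FUNCTION = 0x10;
constexpr flagword BSF_WEAK     = 0x80;
constexpr flagword BSF_OBJECT   = 0x10000;

// Hypothetical helper (not part of BFD): true if every bit in mask is set.
inline bool has_flag(flagword flags, flagword mask) {
    return (flags & mask) == mask;
}

// Hypothetical helper (not part of BFD): a short readable summary of a flagword.
inline std::string describe(flagword flags) {
    std::string out;
    if (has_flag(flags, BSF_GLOBAL)) {
        out = "global";
    } else if (has_flag(flags, BSF_LOCAL)) {
        out = "local";
    } else {
        out = "other";
    }
    if (has_flag(flags, BSF_WEAK))     out += " weak";
    if (has_flag(flags, BSF_FUNCTION)) out += " function";
    if (has_flag(flags, BSF_OBJECT))   out += " object";
    return out;
}
```

For example, a symbol whose flags are BSF_GLOBAL | BSF_FUNCTION would be summarized by this sketch as a global function, while BSF_LOCAL | BSF_OBJECT would be summarized as a local object.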

Appendix B CORBA Terminology

The terminology definitions given in this section have been taken from [17].

CORBA Client - A client is an entity that invokes a request on a CORBA object. A client may exist in an address space that is completely separate from the CORBA object, or the client and the CORBA object may exist within the same application. The term client is meaningful only within the context of a particular request because the application that is the client for one request may be the server for another request.

CORBA Object - A CORBA object is a "virtual" entity capable of being located by an ORB and having client requests invoked on it. It is virtual in the sense that it does not really exist unless it is made concrete by an implementation written in a programming language. The realization of a CORBA object by programming language constructs is analogous to the way virtual memory does not exist in an operating system but is simulated using physical memory.

CORBA Object Reference - An object reference is a handle used to identify, locate, and address a CORBA object. To clients, object references are opaque entities. Clients use object references to direct requests to objects, but they cannot create object references from their constituent parts, nor can they access or modify the contents of an object reference. An object reference refers only to a single CORBA object.

CORBA Request - A request is an invocation of an operation on a CORBA object by a client. Requests flow from a client to the target object in the server, and the target object sends the results back in a response if the request requires one.

CORBA Server - A server is an application in which one or more CORBA objects exist. As with clients, this term is meaningful only in the context of a particular request.

Factory - A factory is a CORBA object that offers one or more operations to create other objects. To create a new object, a client invokes an operation on the factory; the operation's implementation creates a new CORBA object and returns a reference for the new object to the client. Factory operations in a distributed system play the role of the constructor in C++.

Implementation Repository - An implementation repository has the following responsibilities.

It maintains a registry of known servers.

It records which server is currently running on which host and at which port number.

It starts servers on demand if they are registered for automatic start-up.

Each implementation repository must run as a process that listens for requests on a fixed host and at a fixed port number. Implementation repositories are daemon processes that are usually started by a start-up script at boot time.

Object Adapters - An object adapter serves as the glue between servants and the ORB, and is an object that adapts the interface of one object to a different interface expected by a caller. In other words, it is an interposed object that uses delegation to allow a caller to invoke requests on an object without knowing the object's true interface. CORBA object adapters fulfill three key requirements.

They create object references, which allow clients to address objects.

They ensure that each target object is incarnated by a servant.

They take requests dispatched by a server-side ORB and further direct them to the servants incarnating each of the target objects.

In C++, servants are instances of C++ objects. They are typically defined by deriving from skeleton classes produced by compiling IDL interface definitions. To implement operations, you override virtual functions of the skeleton base class. You register these C++ servants with the object adapter to allow it to dispatch requests to your servants when clients invoke requests on the objects incarnated by those servants. Until version 2.1, CORBA contained specifications only for the Basic Object Adapter (BOA). However, a couple of problems arose with the BOA specification.


The BOA specification did not account for the fact that, because of their need to support servants, object adapters tend to be language-specific. Because CORBA originally provided only a C language mapping, the BOA was written to support only C servants. Later attempts to make it support C++ servants proved to be difficult.

A number of critical features were missing from the BOA specification. Certain interfaces were not defined and there were no servant registration operations. Even those operations that were specified contained many ambiguities. ORB vendors developed their own proprietary solutions to fill the gaps, resulting in poor server application portability between different ORB implementations.

The Portability Enhancement RFP [30] issued by the OMG in 1995 to address these issues contained a seven-page listing of problems with the BOA specification. CORBA version 2.2 introduced the Portable Object Adapter (POA) to replace the BOA. Because the POA addresses the full gamut of interactions between CORBA objects and programming language servants while maintaining application portability, the quality of the POA specification is vastly superior to that of the BOA. As a result, the BOA specification has been removed from CORBA.
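The three object adapter requirements listed in this entry (creating references, ensuring each object is incarnated by a servant, and dispatching requests to servants) can be illustrated with a toy sketch in plain C++. This is not the BOA or POA API and uses no ORB: the classes CounterSkeleton, CounterServant and ToyAdapter are hypothetical, and the "object reference" is simplified to a string key, whereas real object references are opaque to clients.

```cpp
#include <cassert>
#include <map>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for a skeleton class generated from IDL:
// each operation is a pure virtual function that a servant overrides.
class CounterSkeleton {
public:
    virtual ~CounterSkeleton() = default;
    virtual int increment() = 0;
};

// A servant: a C++ object that incarnates one counter object.
class CounterServant : public CounterSkeleton {
public:
    int increment() override { return ++value_; }
private:
    int value_ = 0;
};

// Toy adapter (an illustration only, not the CORBA POA interface).
class ToyAdapter {
public:
    // Requirements 1 and 2: register a servant and hand back a reference
    // that clients can use to address the object it incarnates.
    std::string activate(CounterSkeleton& servant) {
        std::string ref = "obj-" + std::to_string(next_id_++);
        servants_[ref] = &servant;
        return ref;
    }

    // Requirement 3: direct an incoming request to the servant
    // that incarnates the target object.
    int dispatch_increment(const std::string& ref) {
        auto it = servants_.find(ref);
        if (it == servants_.end()) {
            throw std::runtime_error("no servant for " + ref);
        }
        return it->second->increment();
    }

private:
    std::map<std::string, CounterSkeleton*> servants_;
    int next_id_ = 1;
};
```

In this sketch the caller never touches CounterServant directly; it only holds the reference returned by activate and lets the adapter find the servant, which mirrors the delegation role described above.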

OMG Interface Definition Language (IDL) - The IDL is a language that defines the object interfaces of implemented objects. An object's interface is composed of the operations it supports and the types of data that can be passed to and from those operations. The IDL provides this information so that clients can know what objects and methods are available. Its sole purpose is to allow object interfaces to be defined in a manner that is independent of any particular programming language, so it is not a programming language per se. But this arrangement is what allows applications implemented in different programming languages to interoperate.

Servant - A servant is a programming language entity that implements one or more CORBA objects. Servants are said to incarnate CORBA objects because they provide bodies, or implementations, for those objects. Servants exist within the context of a server application. In C++, servants are object instances of a particular class.

Servant Manager - A servant manager is a callback object that can assist the POA in obtaining a servant for an object. Explicit servant registration may be extremely expensive or even impossible, for example when an application contains thousands of objects, or when an application functions as a gateway to another distributed system and thus might have to learn of the presence of objects dynamically. There are currently two types of servant managers, ServantActivator and ServantLocator. One difference between the two is that a servant locator will most likely obtain a servant for a single request, performing some pre-invocation and post-invocation operations immediately before and after the request, whereas a servant activator will incarnate an object when a request is first received for an object reference, and the incarnated object will persist until it is explicitly etherealized by the client.
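The difference between the two kinds of servant manager described in this entry can be sketched in plain C++ without an ORB. The classes below are hypothetical stand-ins, not the CORBA ServantActivator and ServantLocator interfaces: ToyActivator creates a servant on the first request and keeps it until etherealize is called, while ToyLocator obtains a servant for one request only and counts its pre- and post-invocation steps.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Hypothetical servant that only counts the requests it has handled.
struct Servant {
    explicit Servant(std::string id) : object_id(std::move(id)) {}
    std::string object_id;
    int requests_handled = 0;
};

// Activator style: incarnate on first request, keep until etherealized.
class ToyActivator {
public:
    Servant& servant_for(const std::string& id) {
        auto it = active_.find(id);
        if (it == active_.end()) {
            // First request for this object: incarnate a servant now.
            it = active_.emplace(id, std::unique_ptr<Servant>(new Servant(id))).first;
            ++incarnations_;
        }
        return *it->second;
    }
    // Explicitly destroy the servant for this object.
    void etherealize(const std::string& id) { active_.erase(id); }
    int incarnations() const { return incarnations_; }

private:
    std::map<std::string, std::unique_ptr<Servant>> active_;
    int incarnations_ = 0;
};

// Locator style: a servant is obtained for a single request only.
class ToyLocator {
public:
    int handle(const std::string& id) {
        ++preinvokes_;                             // pre-invocation step
        Servant servant(id);                       // servant lives for this request
        int result = ++servant.requests_handled;   // the request itself
        ++postinvokes_;                            // post-invocation step
        return result;
    }
    int preinvokes() const { return preinvokes_; }
    int postinvokes() const { return postinvokes_; }

private:
    int preinvokes_ = 0;
    int postinvokes_ = 0;
};
```

With the activator, state accumulates across requests because the same servant is reused; with the locator, every request starts from a fresh servant, so per-servant state does not carry over.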

Tie class - A tie class is a C++ class template that can be instantiated to create a concrete servant. A tie-based servant implements all methods by delegating them to another C++ object.
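The delegation performed by a tie class can be sketched as follows. This is an illustration under stated assumptions rather than code generated by an IDL compiler: GreeterSkeleton stands in for a generated skeleton, PlainGreeter is an ordinary class that does not inherit from it, and GreeterTie is a hypothetical tie template whose name and constructor are chosen here for clarity.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for an IDL-generated skeleton class.
class GreeterSkeleton {
public:
    virtual ~GreeterSkeleton() = default;
    virtual std::string greet(const std::string& name) = 0;
};

// An ordinary C++ class that knows nothing about skeletons or CORBA.
class PlainGreeter {
public:
    std::string greet(const std::string& name) { return "hello, " + name; }
};

// Tie template: becomes a concrete servant once instantiated, and
// implements every operation by forwarding it to the tied object.
template <typename T>
class GreeterTie : public GreeterSkeleton {
public:
    explicit GreeterTie(T& target) : target_(target) {}

    std::string greet(const std::string& name) override {
        return target_.greet(name);  // pure delegation, no logic of its own
    }

private:
    T& target_;  // the tied object must outlive the tie
};
```

The design choice illustrated here is that the implementation class stays free of any inheritance from the skeleton, which can be convenient when an existing class is to be exposed as a servant without modification.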

Appendix C Java Programming Language

The Java programming language by Sun Microsystems provides two major features that make it well suited to distributed geometric computing: sockets and GUI objects. By simply creating a ServerSocket object in a server application, client applications can begin communicating with it through a Socket object, created similarly, without regard to the type of system on which the applet or application is running. GUI objects such as buttons, canvases and panels can also be created easily with predefined classes provided by the Java library.

Both datagram (User Datagram Protocol, or UDP) and stream-based (Transmission Control Protocol, or TCP) sockets are provided in the Java application programming interface (API). However, security restrictions prevent applets from waywardly creating sockets on users' machines; sockets can only be created on the host that provided the applet. Therefore, if a Java application is running on the server, another applet cannot create a socket on the remote host to even connect back to the server application. Datagram sockets require such a configuration. Although Java applications (as opposed to applets) would work without a problem, that would defeat the purpose of allowing users to easily access the system without having to download the application itself. Stream-based sockets, however, can be used in an applet where the TCP socket is created on the server and the applet communicates directly with that port. Therefore, using TCP sockets, applets that provide distributed geometric computing can be created easily.

In addition to the language limitations, TCP is favorable because of its stability, especially in large networks. UDP packets are not "acknowledged" by the recipient, so the greater the distance between the sender and receiver, the more likely a packet is to be lost. This often results in the user's algorithm "hanging" during execution, without any means of recovering. The user is unfortunately forced to kill the execution of the algorithm in this case. Adding error-checking packets for acknowledgements would most likely only increase the number of lost packets over the network.

We create a GeoLIB library that supports TCP messaging. Since setup and disconnect packets are not sent for each message (they are required only upon connection to and disconnection from the system), the packet sizes are smaller, and TCP's reliability keeps the transmission speed from degrading as much as it does with UDP.
