Toward Automated Support for Transparent Interoperable Queries

Alan Kaplan, Bradley Schmerl, Rajesh Veeraraghavan∗
Department of Computer Science
Clemson University
Clemson, SC 29634 USA
{kaplan,schmerl,rajesh}@cs.clemson.edu

Abstract

Many object-oriented databases (OODBs) are based on programming languages (such as C++, CLOS and Smalltalk) that pre-date Java. Given Java’s rapid emergence, there is a growing need for interoperating existing or legacy OODBs with new applications that require Java. Unfortunately, there are few mechanisms that allow software developers to easily integrate Java applications with non-Java OODBs. Although various interoperability mechanisms have been developed, these approaches have some significant drawbacks in practice. They are often difficult to use, provide little, if any, automated support, and produce software that is difficult to engineer and maintain. In this paper, we describe ongoing work in providing interoperability support that allows application developers to seamlessly and transparently access non-Java OODBs from Java applications. Our approach involves embedding statements written in an object query language (called JOQL) into Java applications that are used to query C++-based OODBs. We also describe an accompanying toolset that processes Java programs containing JOQL queries. The toolset produces the necessary code allowing Java applications to access and manipulate a C++-based OODB. As a result, application developers are free to work in Java without having to concern themselves with the details of interoperating with C++. Finally, we provide some preliminary experimental data that demonstrates our approach incurs a modest performance overhead.

1 Introduction

In recent years, Java has emerged as a popular choice for developing many new applications. Not surprisingly, Java-based object-oriented databases (OODBs) are beginning to appear in both commercial and research arenas. However, many OODBs are based on programming languages (such as C++, CLOS and Smalltalk) that pre-date Java and, despite Java’s increasing popularity, support for these “older” OODBs continues today. This situation has resulted in a serious problem for OODB technology, specifically, how to interoperate existing or legacy OODBs with new applications that require Java.

This problem can be addressed by interoperability mechanisms that facilitate the integration and access of data and operations described in different languages. An aim of this work is to hide the fact that interoperation between languages is occurring, i.e., to hide the fact that the notation used to access and manipulate persistent data (e.g., Java) is different from the notation used to originally create and store the data (e.g., C++). Unfortunately, contemporary interoperability mechanisms are difficult to use, and applications that require their use are difficult to engineer and maintain. Present approaches to interoperability demand far too much programmer involvement in low-level details to be appealing to most software developers. Although some of these approaches provide a modicum of automated support, in general they are not well integrated and require manual intervention, thus making them tedious to use and prone to error. As a result, they force software developers to waste valuable time dealing with the complexities of a particular interoperability mechanism, instead of focusing on a problem domain. (See [1, 6] for a more detailed assessment of current interoperability approaches.)

In this paper, we describe an interoperability approach that provides application developers seamless and transparent access to non-Java OODBs from Java applications. Our approach centers around an extension to Java called JOQL (Java Object Query Language), a declarative language that is used to embed queries in Java applications. An accompanying toolset processes JOQL queries and generates code enabling Java applications to seamlessly execute those queries over C++-based OODBs. Although the objects that reside in the database are defined in C++, the Java application views and manipulates the objects as if they were defined in Java. As a result, application developers are free to work in Java without having to concern themselves with the details of interoperating with C++.

The remainder of the paper is organized as follows. In Section 2 we describe the syntax and semantics for JOQL. We then describe the architecture of the JOQL toolset in Section 3, followed by a preliminary performance evaluation in Section 4. Section 5 summarizes the current status of our work and discusses some future research directions.

∗ Rajesh Veeraraghavan ([email protected]) is currently employed by Microsoft Corporation, Redmond, WA.

2 Java OQL Syntax and Semantics

The current version of JOQL is based on the Open OODB C++ OQL [9] and is a subset of the ODMG OQL specification [4]. Object query languages in general are declarative languages for querying and updating OODBs. Using JOQL, programmers can embed declarative queries inside method bodies within Java applications. The syntax for a JOQL query can then be checked statically. This contrasts with other approaches (such as JavaBlend and JDBC, described in [2]), which involve dynamically constructing a query as a string and having it interpreted by the database.

Figure 1(a) shows the syntax for JOQL queries. A JOQL query is specified in a query block. A query block begins with the keyword Query and the query body is enclosed in subsequent matching braces. Capitalized symbols represent keywords. The square brackets indicate special clauses inside the query. Each query is executed in a single transaction.

    Query {
      [result] = SELECT [object]
                 FROM [range variable] IN [collection]
                 WHERE [predicate];
    }

    (a) Syntax

    Set newCars;
    Query {
      newCars = SELECT matchCar
                FROM Car matchCar IN fleet
                WHERE matchCar.ageOf() < 10;
    }

    (b) Example

Figure 1: Java Object Query Language

A query returns a collection of objects. In Figure 1(a), the symbol [result] is a Java reference to a collection of objects after a query is executed. This reference must be declared as a Java Set¹ and appear prior to the query block. The type of the individual objects contained by the collection is determined by the SELECT clause. In particular, the [object] symbol is a Java reference to any object that satisfies a query. The FROM clause specifies the range variables for a query and identifies the collection of objects over which the query should be performed. A [range variable] is represented using a Java reference declaration.² A [collection] is the name of a persistent collection of objects that reside in a database. Finally, the WHERE clause defines a predicate that must be satisfied by the objects found by a query. A [predicate] can be any legal Java boolean condition that consists of an object method invocation, a comparison operator and a literal value.

¹ Our toolset supplies a Set class that is a collection of Java objects.
² The specification of [object] and [range variable] is, in fact, somewhat redundant. It is included in the JOQL syntax partly to ease parsing of JOQL queries.

To help illustrate JOQL, consider the Java code fragment shown in Figure 1(b), which retrieves Car objects that are less than ten years in age. In this query, the reference variable newCars refers to a set of Cars returned by the query. Note that this reference is declared prior to the query block. Within the query block, the reference variable matchCar is declared in the range variable specification Car matchCar. The identifier fleet indicates the name of the collection over which the query will be performed, i.e., a collection of Car objects. Finally, the predicate matchCar.ageOf() < 10 uses the method ageOf that is defined by the Car class. As noted above, the reference variable newCars will contain a set of Car objects that satisfy the query. Although the individual Car objects are actually implemented in C++, they appear and behave as Java objects. Section 3 summarizes how this is achieved.

Despite its limited expressive power (compared to a fully-fledged OQL), JOQL is useful for demonstrating and experimenting with our interoperability technique. Although not illustrated, JOQL supports updates in addition to queries. We also note that the JOQL definition is independent of the actual object implementation language. As discussed in greater detail in Section 5, supporting queries over both persistent C++ and Java objects is an area of future work.

3 The JOQL Toolset

Although the objects in the OODB are originally defined and implemented in C++, the JOQL toolset provides seamless interoperation between a Java application and a C++ OODB. What this means is that a Java developer need only be concerned with writing queries in JOQL, and manipulating all the objects (that are returned by a query) in Java. To help illustrate this concept, consider Figure 2, which is based on the example JOQL query initially described in Section 2. From a developer’s point of view, the query appears to be interacting with a Java OODB. However, the Java OODB is actually a C++ OODB, where an interoperability mechanism is responsible for communication between Java and C++. The purpose of the JOQL toolset is to translate JOQL query specifications into executable code and automate, as well as hide, the Java-C++ interoperability mechanism.

    JOQL Application

        Car current;
        Set newCars;
        Query {
          newCars = SELECT matchCar
                    FROM Car matchCar IN fleet
                    WHERE matchCar.ageOf() < 10;
        }
        ...
        current = newCars.elementAt (0);
        int age = current.ageOf ();

    [Diagram: the JOQL Application performs "query collection" and "access object" operations on what appears to be a Java OODB, which consists of the Java - C++ Interoperability Bridge and the C++ OODB.]

Figure 2: Using JOQL to Query an OODB

The JOQL toolset consists of two primary tools: the Query Translator, which parses and translates a (legal) query into executable code, and the Java Proxy Generator, which creates Java classes, called proxies, that encapsulate both the C++ object(s) returned from a query and the objects that can be accessed, either directly or indirectly, from the returned objects. The Java Proxy Generator is similar to the J2C++ tool [5].
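To make the notion of a proxy concrete, the following is a minimal sketch, in present-day Java, of the idea behind a generated proxy. It is not the output of the Java Proxy Generator. The names CarBackend, InMemoryCarBackend and CarProxy are hypothetical, and the in-memory backend merely stands in for the C++ objects that a real proxy would reach through the interoperability bridge.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the native (C++) side: answers method calls
// for objects identified by an opaque handle.
interface CarBackend {
    int ageOf(long handle);
}

// Hypothetical proxy for the C++ Car class. It stores only a handle and
// forwards each method call to the backend, so callers see a Java object.
class CarProxy {
    private final long handle;
    private final CarBackend backend;

    CarProxy(long handle, CarBackend backend) {
        this.handle = handle;
        this.backend = backend;
    }

    int ageOf() {
        return backend.ageOf(handle);
    }
}

// Pure-Java backend used here in place of a C++ OODB (an assumption made
// so that the sketch runs without native code).
class InMemoryCarBackend implements CarBackend {
    private final Map<Long, Integer> ages = new HashMap<>();

    void put(long handle, int age) {
        ages.put(handle, age);
    }

    @Override
    public int ageOf(long handle) {
        Integer age = ages.get(handle);
        if (age == null) {
            throw new IllegalArgumentException("unknown handle " + handle);
        }
        return age;
    }
}

class ProxySketch {
    public static void main(String[] args) {
        InMemoryCarBackend backend = new InMemoryCarBackend();
        backend.put(1L, 7);
        CarProxy car = new CarProxy(1L, backend);
        System.out.println(car.ageOf());
    }
}
```

The application code only ever calls methods on CarProxy, which is what allows the storage language to remain hidden from the Java developer.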

    Application.java (original)

        Set newCars;
        ...
        Query {
          newCars = SELECT matchCar
                    FROM Car matchCar IN fleet
                    WHERE matchCar.ageOf() < 10;
        }
        ...
        Car aCar = newCars.elementAt(0);

    Application.java (modified)

        Set newCars;
        ...
        newCars = JOQL.query ();
        ...
        Car aCar = newCars.elementAt(0);

    JOQL.java

        class JOQL {
          ...
          public static Set query () {
            Set cars;
            cars = cppQuery ();
            return cars;}
          private native Set cppQuery ();
          static {
            System.loadLibrary ("JO3DB");}
        }

    JavaCPP.C

        #include
        ...
        JNIEXPORT jobject JNICALL
        Java_JOQL_cppQuery (...) {
          Set_Car* cars = Joql ();
          // translate "cars" to a Java collection
          // and return to Java application
          return (...);
        }

    JOQL.C

        #include
        ...
        Set_Car* Joql () {
          Set_Car* cars;
          OpenOODB->beginT ();
          Query { cars = SELECT matchCar FROM Car matchCar IN fleet
                  WHERE matchCar.ageOf() < 10;}
          OpenOODB->commit();
          return cars;
        }

Figure 3: Components Produced by Query Translator

3.1 The Query Translator

As noted above, the Query Translator is responsible for translating a JOQL query into executable code that performs the query over a C++ OODB. To accomplish this, our overall approach assumes the existence of a C++ OQL. A C++ OQL is used to query C++ objects that reside in an OODB. The Query Translator transforms a JOQL query into an equivalent C++ OQL query, and the resulting C++ OQL query is encapsulated in a C++ function. In order for the Java application to invoke this C++ function, the original JOQL query (embedded in the Java application) is replaced with Java code that invokes the generated C++ function. Finally, the toolset automatically applies the Java Native Interface (JNI) mechanism [8] in order to generate an interoperability bridge between the Java application and the generated C++ OQL code.

To gain a better understanding of the internal operation of the Query Translator, consider a C++ OODB that is populated with instances of the Car class. Suppose a software developer is writing a Java application that needs to access persistent C++ data. As shown in Figure 2, the application needs to locate all cars that have been built in the last ten years. To accomplish this task, a software developer might specify the JOQL query as shown in Figure 1(b) and then compile the query using the Query Translator. Figure 3 shows the various Java and C++ components that are produced by the Query Translator. It also shows the control flow relationships between the components, namely how the Java application actually executes the C++ OQL query. Specifically, the following components are generated by this tool:³

JOQL.C: This file contains a stand-alone C++ function, named Joql, that encapsulates the C++ OQL query. In our example, the function executes the query and returns a C++ collection of Car objects that are less than ten years in age.

JOQL.java: This Java class provides an intermediary between the Java application and the C++ interoperability mechanism, to access the generated C++ function. It defines a public method that begins the query process by invoking a private native method, which in turn invokes a Java-C++ interoperability bridge. The class also defines a static initializer for loading the various (shared) C++ libraries.

JavaCPP.C: A Java-C++ interoperability bridge is generated that provides a native implementation of the query for JOQL.java. It invokes the self-contained C++ OQL query that appears in JOQL.C and returns the resulting objects. This is written using the Java Native Interface (JNI).

Application.java: The JOQL query appearing in the original Java application is replaced with a call to the query method defined in JOQL.java.

³ The filenames presented in this example are used for illustrative purposes only; the general tool allows an arbitrary number of queries to be specified.
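The bridge function name Java_JOQL_cppQuery in Figure 3 is consistent with the JNI convention for resolving native methods [8], under which the name of the native function is derived from the Java class name and method name. The sketch below illustrates that convention for non-overloaded methods. The class JniNames and its two methods are hypothetical helpers introduced for illustration only; they are not part of the JOQL toolset or of the JNI itself.

```java
// Hypothetical helper that derives the native function name which the JNI
// looks up for a non-overloaded native method.
class JniNames {

    // Escapes one name according to the JNI name-mangling rules.
    static String mangle(String name) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < name.length(); i++) {
            char c = name.charAt(i);
            if (c == '.' || c == '/') {
                out.append('_');            // package separator
            } else if (c == '_') {
                out.append("_1");           // literal underscore
            } else if (c == ';') {
                out.append("_2");
            } else if (c == '[') {
                out.append("_3");
            } else if ((c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')
                    || (c >= '0' && c <= '9')) {
                out.append(c);              // plain alphanumeric character
            } else {
                // Any other character is written as _0 plus four hex digits.
                out.append(String.format("_0%04x", (int) c));
            }
        }
        return out.toString();
    }

    // Builds "Java_<class>_<method>" from a fully qualified class name.
    static String nativeFunctionName(String className, String methodName) {
        return "Java_" + mangle(className) + "_" + mangle(methodName);
    }

    public static void main(String[] args) {
        System.out.println(nativeFunctionName("JOQL", "cppQuery"));
    }
}
```

Because the toolset generates both JOQL.java and JavaCPP.C, it can keep the two names in agreement without involving the application developer.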

After a query has been invoked, the result of the query must be returned to the Java application. A C++ OQL query returns a set of objects, implemented using a Set class defined by Open OODB. To be able to access them from Java, we also need to gather these objects into a Java collection. The Application.java component in Figure 3 expects the query to return a Java Set and manipulates it as a Java object. Although faced with several options, the one chosen for the prototype involves using callbacks from C++ to Java. The C++ Set returned as the result of a query is traversed and, for every C++ object, a corresponding Java proxy is created (via a callback) and added to a Java Set. Referring to Figure 3, the code generated in the JavaCPP.C component calls back to the JOQL class code. This code adds each object that matches the query to a Java Set. Thus, the Java Set is a collection of Java proxy objects, and the set is returned to, and manipulated by, the application.

In summary, the Query Translator converts a JOQL query into an equivalent C++ OQL query. Then, using the protocols and constructs defined by the JNI, it generates code that enables the Java application to transparently interoperate with C++. The code is produced so that a Java programmer is unaware that a call to a C++ query is being made. We note that this approach involves several levels of indirection. The first is to call the method in the Java class to initiate the query. The second is a call across the Java-C++ bridge via the JNI. Finally, this bridge calls the C++ function, which contains the generated Open OODB OQL query.
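The callback scheme described above can be sketched as follows. This is a simplified illustration in present-day Java, assuming that each C++ object is identified by an opaque handle. The names ObjectProxy, traverseNativeSet and toJavaSet are hypothetical, and the array of handles stands in for the C++ Set that the real bridge would traverse on the C++ side.

```java
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.function.LongConsumer;

// Hypothetical proxy that wraps the handle of one C++ object.
class ObjectProxy {
    final long handle;

    ObjectProxy(long handle) {
        this.handle = handle;
    }
}

class CallbackTranslation {

    // Stand-in for the C++ side of the bridge: walks the native result set
    // and invokes the Java callback once per element.
    static void traverseNativeSet(long[] nativeResult, LongConsumer callback) {
        for (long handle : nativeResult) {
            callback.accept(handle);
        }
    }

    // Java side: each callback creates a proxy and adds it to the Java Set
    // that is eventually returned to the application.
    static Set<ObjectProxy> toJavaSet(long[] nativeResult) {
        Set<ObjectProxy> result = new LinkedHashSet<>();
        traverseNativeSet(nativeResult, handle -> result.add(new ObjectProxy(handle)));
        return result;
    }

    public static void main(String[] args) {
        Set<ObjectProxy> proxies = toJavaSet(new long[] {10L, 11L, 12L});
        System.out.println(proxies.size());
    }
}
```

One callback is made per element of the result, so in this sketch the cost of building the Java Set grows linearly with the number of objects returned.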

4 Evaluation

To evaluate this approach, we measured its performance overhead, specifically by comparing the performance of a heterogeneous system (i.e., a Java application querying a C++ OODB) with that of a homogeneous system (i.e., a C++ application querying a C++ OODB). This section describes some experiments we conducted using the JOQL toolset and provides a preliminary indication of the overhead of our approach.

We conducted two experiments examining the response time of JOQL queries. The first experiment involved populating the Open OODB with instances of a relatively simple Car class. We chose this experiment because the class is relatively small, so we could populate the database with many instances of it and thereby obtain timing data for large sets. The JOQL query used for this experiment is shown in Figure 1(b). To ensure a uniform distribution of ages, Car objects were assigned random age values between 1 and 20 when the database was populated. The JOQL query returned approximately 50% of the objects in the set being queried.

In an effort to simulate more realistic data, in the second experiment we populated the database with classes derived from the OO7 Benchmark [3]. In contrast to the first experiment, the classes in this benchmark are more complex and exhibit more interesting relationships. The query we decided to implement utilizes only the AtomicPart class from the OO7 benchmark, and returns all AtomicParts with a buildDate greater than 1500. The buildDate values for AtomicParts in the database were given random values between 1000 and 1999. Again, on average, the query returned 50% of the elements in the collection.
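The selectivities quoted for the two queries can be checked with a small calculation. The sketch below assumes, although the text does not state it explicitly, that ages and build dates are integers drawn uniformly from the inclusive ranges given. The class and method names are hypothetical.

```java
// Computes the expected fraction of objects selected by each predicate,
// assuming integer values drawn uniformly from an inclusive range.
class Selectivity {

    // Fraction of the integers in [low, high] that satisfy value < bound.
    static double fractionBelow(int low, int high, int bound) {
        int matches = 0;
        for (int v = low; v <= high; v++) {
            if (v < bound) {
                matches++;
            }
        }
        return (double) matches / (high - low + 1);
    }

    // Fraction of the integers in [low, high] that satisfy value > bound.
    static double fractionAbove(int low, int high, int bound) {
        int matches = 0;
        for (int v = low; v <= high; v++) {
            if (v > bound) {
                matches++;
            }
        }
        return (double) matches / (high - low + 1);
    }

    public static void main(String[] args) {
        // Car experiment: ages in 1..20, predicate ageOf() < 10.
        System.out.println(fractionBelow(1, 20, 10));
        // OO7 experiment: buildDate in 1000..1999, predicate buildDate > 1500.
        System.out.println(fractionAbove(1000, 1999, 1500));
    }
}
```

Under these assumptions the two predicates select 9 of 20 values (45%) and 499 of 1000 values (49.9%) respectively, in line with the approximate figure of 50% reported for both experiments.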
For both experiments, the JOQL toolset generated the interoperability code, including the OQL query and the Java-C++ bridge, as discussed in Section 3, in addition to the Java proxies for the Car and AtomicPart classes. To produce the timing data, we inserted code into the programs to start timing just prior to each query and to stop timing and report the result after each query. The times recorded were in user time. We conducted each experiment on a SPARC Ultra 1 with 128 MB of memory running Solaris 2.5.1, using Java 1.1.6, the Sun Solaris C++ compiler, version 4.2, and the TI/ARPA OpenOODB 1.0 with its associated OQL implementation.

As expected, using our approach incurs a cost, especially when translating the set to Java: a 15%–80% decrease in speed. In another experiment that ignored the set translation (which is roughly O(n)), results were in the range of 2%–15% slower than the code written in the homogeneous scenario. This suggests that the Java-C++ bridge does not incur an unreasonable cost.
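As an illustration of how such measurements might be taken and compared, the sketch below times a query and expresses the heterogeneous time as a percentage slowdown relative to the homogeneous baseline. It rests on several assumptions: it uses System.nanoTime, which measures elapsed wall-clock time rather than the user time reported here, and the names TimingSketch, elapsedNanos and slowdownPercent are hypothetical and do not come from our instrumentation code.

```java
// Hypothetical timing helper: measure one query, then compare two timings.
class TimingSketch {

    // Starts timing just before the query runs and stops just after it.
    // Note: this is elapsed wall-clock time, not user CPU time.
    static long elapsedNanos(Runnable query) {
        long start = System.nanoTime();
        query.run();
        return System.nanoTime() - start;
    }

    // Percentage by which the measured time exceeds the baseline time.
    static double slowdownPercent(double baselineTime, double measuredTime) {
        return (measuredTime - baselineTime) * 100.0 / baselineTime;
    }

    public static void main(String[] args) {
        // Placeholder workload standing in for a query.
        long nanos = elapsedNanos(() -> {
            long sum = 0;
            for (int i = 0; i < 1_000_000; i++) {
                sum += i;
            }
            if (sum < 0) {
                throw new IllegalStateException("unreachable");
            }
        });
        System.out.println("elapsed ns: " + nanos);
        System.out.println("slowdown %: " + slowdownPercent(100.0, 115.0));
    }
}
```

With this definition, a heterogeneous query that takes 115 time units against a homogeneous baseline of 100 units corresponds to a 15% slowdown.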

5 Conclusion

This paper describes preliminary results for a transparent and seamless approach to providing Java access to a C++ OODB. Although the implementation of this approach is ongoing, we believe that the results are encouraging enough to continue investigating the approach. A key contribution of this approach is that programmers do not have to deal with primitive language interoperability mechanisms, such as the Java Native Interface, or external type definition languages, such as the ODMG’s ODL or CORBA’s IDL. This provides seamlessness and transparency when writing Java applications that make queries on a non-Java OODB.

The prototype described in this paper demonstrates the feasibility of our approach for providing Java access to an existing non-Java OODB. The prototype utilizes the Open OODB and its associated C++ OQL. Our preliminary performance data indicates that our approach is reasonable, although clearly there are opportunities to improve upon our initial results. Future work will focus on generalizing and extending JOQL, experimenting with various methods of returning values, conducting more performance evaluations and increasing the generality of our proxy generators. We also plan to build on the work of [1, 7] to support polylingual interoperability, i.e., not only allowing Java to access C++ objects, but also C++ to access Java objects.

References

[1] D. J. Barrett, A. Kaplan, and J. C. Wileden. Automated Support for Seamless Interoperability in Polylingual Software Systems. The Fourth Symposium on the Foundations of Software Engineering (San Francisco, CA), 1996.
[2] D. Barry and T. Stanienda. Solving the Java Storage Problem. IEEE Computer, 31(11):33–40, 1998.
[3] M. Carey, D. Dewitt, and J. Naughton. The OO7 Benchmark. Proceedings of the 1993 ACM SIGMOD Conference (Washington, D.C.), 1993.
[4] R. Cattell and D. Barry, editors. The Object Database Standard: ODMG 2.0, Series in Data Management Systems. Morgan Kaufmann, 1997.
[5] M. Hubbard and A. Schade. J2C++ developer tool for integrating C++ objects with Java applets and applications, 1997. http://www.alphaworks.ibm.com.
[6] A. Kaplan, J. V. Ridgway, and J. Wileden. Why IDLs are not ideal. Proceedings of the Ninth IEEE International Workshop on Software Specification and Design (Ise-Shima, Japan), 1998.
[7] A. Kaplan and J. C. Wileden. Toward Painless Polylingual Persistence. Seventh International Workshop on Persistent Object Systems (Cape May, NJ), 1996.
[8] Sun Microsystems, Inc. (Cupertino, CA). Java Native Interface Specification, 1997. http://java.sun.com/products/jdk/1.1/docs/guide/jni/spec/jniTOC.doc.html.
[9] Texas Instruments, Inc. (Dallas, TX). Open OODB 1.0 Query Language User Manual, 1995.
