Design Alternatives in the Definition of IDL Interfaces

MARKUS ALEKSY
Department of Information Systems
University of Mannheim
Schloß (L 5,6)
68131 Mannheim, Germany
+49 621 181 1488
[email protected]

MARTIN SCHADER
Department of Information Systems
University of Mannheim
Schloß (L 5,6)
68131 Mannheim, Germany
+49 621 181 1639
[email protected]

ABSTRACT
In this paper we analyze criteria that might be useful and important for the design of the IDL interface of a CORBA-based application. We elaborate on the alternative language concepts that IDL offers in many situations. We also point out the constructs that are preferable in most cases in order to avoid potential design errors in the early development phases.

Keywords: IDL, CORBA, value types, distributed systems

1. INTRODUCTION
The Object Management Group (OMG) introduced the Interface Definition Language (IDL) [3] to describe the interfaces of objects that provide server functionality. IDL is a purely declarative language, i.e., it is used to define types and their interfaces in the form of attributes, operations, and exceptions. No implementation, however, is specified. IDL enables language independence because the interfaces are translated into the concepts of a specific programming language only by an IDL compiler. In addition to the language mappings defined in the OMG standard (IDL to Ada, C, C++, COBOL, Java, Lisp, Python, and Smalltalk), several non-standard bindings, e.g., to Eiffel, Objective-C, or Perl, are supported by some Object Request Broker (ORB) vendors.

2. ALTERNATIVES IN THE DEFINITION OF IDL INTERFACES
In many cases, IDL offers several alternative ways to define an interface with specific semantics. When selecting one of these alternatives, different criteria such as readability, portability, or performance of the application might be essential. In the following, we analyze some of those alternatives in detail with respect to their advantages and drawbacks.
These are the problem domains that we will deal with in the next sections:
• Arrays vs. sequences,
• Attributes vs. operations,
• Returning results from an operation,
• Object vs. value types.

2.1. Arrays vs. Sequences
An IDL array contains a fixed number of elements of any valid IDL type. To define an array type, IDL uses the typedef keyword, the data type of the array elements, an identifier that names the array type, and the array size enclosed in [ and ]. An array definition without typedef or with unspecified array size would be invalid. The array size must be a positive integer constant. Multidimensional arrays may be defined by appending additional array sizes, e.g.:

    typedef long lma[100][20];

The alternative to using arrays is to define sequences. An IDL sequence contains a variable number of elements that are, again, of any valid IDL type. A maximum number of elements in the sequence can be specified (bounded sequence). If no maximum size is specified, the sequence is unbounded and the number of elements is limited only by the memory available at run time. Since the elements of a sequence can be of arbitrary type, it is possible to define sequences of sequences. In that case, it is not necessary to name the sequence type of the elements via typedef (anonymous sequence), e.g.:

    typedef sequence<sequence<long> > ssl;

The notable advantage of unbounded sequences is the flexibility that comes from holding a variable number of elements. Arrays, on the other hand, may provide better performance, e.g., when passed as an operation argument or received as the result of an operation invocation.
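The three container forms discussed above can be placed side by side in a minimal IDL sketch. The module, type, and operation names (ContainerDemo, MonthlyTotals, PartialTotals, Samples, Statistics) are hypothetical and chosen only for illustration; they do not come from any OMG specification.

```idl
// Illustrative sketch; all identifiers are assumptions.
module ContainerDemo {
  // Array: always exactly 12 elements.
  typedef float MonthlyTotals[12];

  // Bounded sequence: between 0 and 12 elements.
  typedef sequence<float, 12> PartialTotals;

  // Unbounded sequence: length limited only by available memory.
  typedef sequence<float> Samples;

  interface Statistics {
    MonthlyTotals get_totals();
    PartialTotals get_totals_so_far();
    Samples get_samples();
  };
};
```

The typedefs are needed here because an operation's return type must be a named type. An array suits data whose size is part of its meaning, whereas a sequence suits data whose length is only known at run time.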
2.2. Attributes vs. Operations
IDL allows programmers to decide whether access to data elements is provided through attributes or via operations. In a purely object-oriented design, attributes are a means to describe object state and operations are used to describe object behavior. However, in any language binding, an IDL attribute x is mapped to a pair of accessor functions, conceptually get_x and set_x. The single exception to this rule is a readonly attribute: only the get operation is defined in that case. Therefore, instead of defining attributes that will be transformed into operations by the IDL compiler, developers might use operations from the beginning. The following two IDL interface definitions are logically equivalent.

    interface A {
      attribute string name;
      readonly attribute unsigned long id;
    };

    interface B {
      string get_name();
      void set_name(in string name);
      unsigned long id();
    };

Here, the two name accessors must be named differently, since IDL does not allow overloading of operation names. As can be seen from the id operation, readonly attributes are treated analogously. If IDL allowed overloading of operation names, developers would not notice any difference when calling operations statically. An IDL-to-Java compiler generates the following code.

    public interface AOperations {
      public String name();
      public void name(String val);
      public int id();
    }

    public interface BOperations {
      public String get_name();
      public void set_name(String name);
      public int id();
    }

When employing the Dynamic Invocation Interface (DII), the differences are negligible. In that case, an attribute x is accessed through
• _get_x for retrieving the value of the attribute,
• _set_x for setting the value of the attribute.
For the above example, this implies that in order to retrieve the value of the attribute name of interface A via the DII one would call _get_name. In contrast, operations are always called by their operation name, i.e., even when using the DII one simply calls get_name on interface B.

There are other advantages that speak for the definition of operations. An operation can raise a user-defined exception, which the compiler-generated accessor functions for attributes cannot. Developers could, e.g., define an InvalidNameException that is raised if set_name is called with an invalid argument. Further, an operation can be specified oneway. Operations whose calls have to be propagated in only one direction (client to server) are marked oneway. Oneway operations must not contain any out or inout parameters and must specify void as return type. Especially for time-critical operations it may make sense to define them oneway, since they are performed significantly faster and their call is non-blocking.

Direct access to attributes is frowned upon in the object-oriented community. What could be the reasons that cause a CORBA developer to define IDL attributes? In our opinion, these reasons are hardly convincing. The first reason might be the saving of time: only one attribute definition is needed instead of two operation definitions. The second reason is of a more semantic nature: data elements of objects that are identified in the analysis/design model of the problem domain are mapped "correctly". Comparing the pros and cons of both approaches, we tend to recommend the consistent use of operations.

2.3. Returning Results From an Operation
IDL leaves it up to the CORBA developer to decide whether the result of an operation is passed to the client as a return value or through an out parameter. The IDL syntax supports three types of parameter declarations.
• in: the parameter is passed from client to server.
• out: the parameter is passed from server to client.
• inout: the parameter is passed in both directions.
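As a minimal sketch of the three parameter modes listed above, the following hypothetical interface uses each of them once. The names ParameterDemo, Counter, add, get_range, and clamp are assumptions made purely for illustration.

```idl
// Illustrative sketch; all identifiers are assumptions.
module ParameterDemo {
  interface Counter {
    // in: the client supplies the increment;
    // the new count travels back as the return value.
    long add(in long amount);

    // out: the server fills in both results;
    // the client sends no data for these parameters.
    void get_range(out long lowest, out long highest);

    // inout: the client sends a reading and receives it
    // back, adjusted by the server, in the same parameter.
    void clamp(inout long reading);
  };
};
```

A single operation can return only one value directly, so out parameters are the way to deliver several results in one call, while inout combines a "set" and a "get" in a single round trip.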
out and inout parameters may be used to communicate results to the caller of an operation. We are going to elaborate on these possibilities with the help of a small time service application. This simple example should not be compared to the Time Service that the OMG specified in the CORBAservices framework [4]. The following IDL specification contains the definition of a TimeServer using the operation-centric approach.
    module TimeServer {
      interface Time {
        exception InvalidData { };
        unsigned short get_hours();
        void set_hours(in unsigned short hours)
          raises(InvalidData);
        unsigned short get_minutes();
        void set_minutes(in unsigned short minutes)
          raises(InvalidData);
        unsigned short get_seconds();
        void set_seconds(in unsigned short seconds)
          raises(InvalidData);
      };
    };

In that case, clients have to call all three get operations in order to obtain the time value. Here, the following approach, which adds combined operations, could be more favorable.

    module TimeServer {
      interface Time {
        exception InvalidData { };
        unsigned short get_hours();
        .....
        void set_seconds(in unsigned short seconds)
          raises(InvalidData);
        void get_time(out unsigned short hours,
                      out unsigned short minutes,
                      out unsigned short seconds);
        void set_time(in unsigned short hours,
                      in unsigned short minutes,
                      in unsigned short seconds)
          raises(InvalidData);
      };
    };

Instead of retrieving the value of each single attribute, we can now access the values of all attributes with one operation call (get_time). In most cases, this will reduce network load. A second alternative would be the aggregation of the attributes in a data structure.

    module TimeServer {
      interface TimeManager {
        exception InvalidData { };
        struct Time {
          unsigned short hours;
          unsigned short minutes;
          unsigned short seconds;
        };
        void set_time(in Time t) raises(InvalidData);
        Time get_time();
      };
    };

Since remote CORBA calls do not differ syntactically from conventional local calls, developers always have to bear in mind which calls are local and which are remote, and then optimize their performance accordingly. Developers with little experience in distributed applications tend to port their (non-distributed) applications one-to-one without realizing the potential problems of local vs. remote operation calls. Here, the transformation of several get operations into a single get with several out parameters or the definition of a struct may be useful. If set and get operations are often called in succession, a single operation with one or more inout parameters should be considered instead. A further alternative for the TimeServer example is the definition of a value type, as shown in the next section.

2.4. Object vs. Value Types
Since version 2.3, CORBA supports two types of objects:
• CORBA objects and
• value types.
CORBA objects are the conventional objects that are, as before, defined with the help of an IDL interface. Whenever CORBA objects are passed as parameters to an operation call or returned as an operation's result, the semantics are object-by-reference, i.e., the client passes or obtains a reference that enables it to access the remote object. In comparison to local calls, these remote invocations are expensive, and they cannot be applied efficiently to all kinds of well-known data structures and algorithms. E.g., traversing a tree or modifying a list through remote calls would be slow compared to local operations.

To solve problems of that kind, the OMG introduced the concept of value types. Value types are objects that are passed with object-by-value semantics during an operation call. A value type can contain all the elements that a CORBA object (defined by an IDL interface) can. In addition, it can contain state members that define the value type's state. The operations specified in a valuetype definition are local operations. Thus, the complete state of a value type can be passed and processed locally. In contrast to CORBA objects, value types also provide a null state.

Before the value type specification was added to the CORBA standard, the only alternative to improve performance of applications of the type outlined above was to encapsulate and pass the different attributes of the interface in a single struct parameter. In CORBA, constructed types, such as structs or enums, are passed by value: their members are copied. In this way, performance can be improved, but no methods to access the members of such a struct are available, which violates the information hiding principle. A modification of our TimeServer example that relies on value types would look as follows.

    module TimeServer {
      valuetype Time {
        private unsigned short hours;
        private unsigned short minutes;
        private unsigned short seconds;
        unsigned short get_hours();
        unsigned short get_minutes();
        unsigned short get_seconds();
        factory init(in unsigned short hours,
                     in unsigned short minutes,
                     in unsigned short seconds);
      };
      interface TimeManager {
        void set_time(in Time t);
        Time get_time();
      };
    };

The operations get_hours, get_minutes, and get_seconds are local, and it is not necessary to optimize those accesses. However, get_time and set_time are remote operations that pass the entire state of their result or parameter over the network and instantiate it on the remote host. The way to initialize value type instances automatically after the transfer to the server is to use the IDL keyword factory. The usage of value types brings about the requirement to implement corresponding value type factories. If ORBs with different language bindings are employed in the same application, these factories must be implemented repeatedly in the respective programming languages.

We have already pointed out that value types and CORBA objects differ in that the operations of a value type can only be executed locally. The reason is that management of and access to value types and CORBA objects are strictly separated. The Portable Object Adapter (POA) manages CORBA objects and uses an Interoperable Object Reference (IOR) to identify them unambiguously. Value type objects, on the other hand, are not managed by the POA. They exist only in a local context. Thus, their operations cannot be called from a remote host.
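The tree traversal case mentioned earlier in this section can be sketched with a recursive value type. This is an illustrative assumption rather than part of the TimeServer example; the names TreeDemo, TreeNode, TreeStore, count_nodes, and create are hypothetical.

```idl
// Illustrative sketch; all identifiers are assumptions.
module TreeDemo {
  // Each node carries its state with it when passed by value.
  // A null left or right member marks a missing subtree.
  valuetype TreeNode {
    public long key;
    public TreeNode left;
    public TreeNode right;

    // Local operation: executed in the receiver's address
    // space, never invoked across the network.
    unsigned long count_nodes();

    factory create(in long initial_key);
  };

  interface TreeStore {
    // One remote call transfers the whole tree by value.
    TreeNode get_tree();
    void put_tree(in TreeNode root);
  };
};
```

After a single remote get_tree call, the client holds a local copy and can traverse it without further network traffic. The price is that the copy is detached from the server's data until put_tree is called.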
Value types may be used for caching of accesses to remote CORBA objects in order to reduce network load and access times. In that case, locally performed state changes of one value type instance require corresponding updates of all other cached value type objects so that all parts of the distributed application can continue processing with a valid copy. This might necessitate the implementation of complex coordination and synchronization mechanisms. Different problem domains where value types and CORBA objects are useful for caching are described in [1]. Results of performance measurements can also be found there.

One further important design decision is related to the IDL type any. An any offers the possibility to pass arbitrary data types. The price for that flexibility is a certain loss of performance, because marshalling and unmarshalling take more time for an any than for primitive types.

The last aspect that can be relevant during the design of an IDL interface is the standard conformity of the available products.
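Returning to the trade-off around the type any described above, the following minimal IDL sketch contrasts a generic operation with type-specific ones. The names SettingsDemo, Settings, put, put_long, and put_string are hypothetical and serve only as an illustration.

```idl
// Illustrative sketch; all identifiers are assumptions.
module SettingsDemo {
  interface Settings {
    // Flexible: accepts any IDL type, but every call also
    // marshals a description of the contained type (TypeCode).
    void put(in string name, in any content);

    // Specific: one operation per expected type; only the
    // data itself is marshalled.
    void put_long(in string name, in long content);
    void put_string(in string name, in string content);
  };
};
```

Where the set of types is small and known in advance, the specific operations avoid the extra marshalling work. The generic put is preferable only when the types genuinely cannot be fixed at design time.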
3. PORTABILITY ASPECTS
In [2] we analyzed the standard conformity of existing IDL compilers, with the result that existing products vary in their support of the CORBA standard. In some cases, even CORBA 2.0 constructs such as constant definitions or definitions of nested structs are supported with differing quality. Changes or extensions introduced subsequently, e.g., related to value types or the IDL keyword native, are not always implemented correctly. Some potential problems are also caused by underspecification in the standard, e.g., in the IDL-to-Java mapping of a long double in CORBA 2.3.

In order to circumvent or minimize portability problems in CORBA-based application development, one should first decide on the CORBA version that is taken as a basis. In the next step, the implementation language should be selected from the available IDL mappings. Only after deciding on these criteria can developers choose the ORB or ORBs to be utilized. Now the IDL interface can be defined and tested. In case of problems with the IDL compiler, a modified interface definition may be tested.

In practice, it will rarely be possible to follow the outlined procedure strictly, because an ORB was already acquired and must be used in future projects. In that case, all parties involved have to test whether the IDL interface is translated by their product. Should problems arise, the necessary modifications of the interface have to be performed by all parties.
4. CRITICAL REVIEW
Our study has shown that IDL often offers various alternatives for defining logically equivalent interface constructs. Developers have to analyze these alternatives with regard to flexibility, performance, and portability. An application whose architecture is oriented toward performance should define an additional operation with several out parameters instead of using the accessor functions of its attributes. Alternatively, a struct or a valuetype could be defined and passed to or from an operation. Also, it may be useful to replace several operations by one operation that declares inout parameters. Other performance improvements can be achieved by defining arrays instead of sequences and by using basic types instead of type any.

It should always be clear whether accesses to IDL data types are local or remote. Especially when objects are employed, it is advisable to consider carefully whether value types or CORBA objects are more suitable. These considerations are only possible after thorough study of IDL and ongoing development of one's knowledge, since the standard is continually revised.

Another hurdle is the standard conformity of available ORBs. Developers have to work continuously with different products in order to be aware of the strengths and weaknesses of current versions. This is only possible at a considerable cost in time.
5. REFERENCES
[1] Aleksy, M., Kuhlins, S., Using Value Types to Improve Access to CORBA Objects, Proceedings of the 2nd International Conference on Software Engineering, Artificial Intelligence, Networking & Parallel/Distributed Computing (SNPD 2001), ACIS, pp. 841-847, 2001.
[2] Aleksy, M., Schader, M., Standardkonformität von IDL-Compilern und deren Einfluß auf die Interoperabilität, OBJEKTspektrum 1, 2001.
[3] OMG, CORBA/IIOP 2.5 Specification, OMG Technical Document Number 01-09-01, http://www.omg.org/technology/documents/formal/corba_iiop.htm.
[4] OMG, Time Service Specification, OMG Technical Document Number 00-06-27, ftp://ftp.omg.org/pub/docs/formal/00-06-27.pdf, 2000.