Genja - A New Proposal for Parameterised Types in Java

Mark Evered, James Leslie Keedy, Gisela Menger, Axel Schmolitzky
University of Ulm
Abteilung Rechnerstrukturen, Universität Ulm, 89069 Ulm, Germany
[email protected]

Abstract: Recent proposals for adding parameterised types to Java have left a number of important practical issues undiscussed. In this paper we present the language Genja which is a new generic extension of Java oriented towards practical support for generic collection types. We discuss design alternatives related to extending the power of unconstrained genericity, solving the weaknesses of constrained genericity in other proposals and defining the compatibility of named and anonymous instantiations. By enhancing support for reusability and providing a higher level style of programming via a library of standard generic collection types Genja aims to extend Java's contribution to efficient software production.

1 Introduction

It is well recognised that the addition of parameterised types to Java [Gosling 96] could significantly enhance its support for software reuse. Current Java programming practice (as seen, for example, in the library classes 'Vector' and 'Hashtable') is to use the root class 'Object' and run-time type checks as an alternative to parameterised types, but this incurs a run-time overhead and discovers errors later than desirable.

Two proposals have recently been made for parameterised types in Java. (A third proposal [Thorup 97] is not discussed here since it assumes dynamic type checks and we are concerned with static type safety.)
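The late error detection mentioned above can be illustrated with a minimal sketch in present-day Java (it is not taken from the paper, and the class and method names are hypothetical). A 'Vector' of 'Object' accepts any element, so a wrong assumption about the element type surfaces only when the cast is checked at run time:

```java
import java.util.Vector;

// Hypothetical demo class; the names are illustrative only.
final class RawVectorDemo {
    private RawVectorDemo() {}

    // Stores a String in a Vector of Object and then wrongly reads it back
    // as an Integer. Returns true if the mistake is detected only by the
    // run-time type check on the cast.
    static boolean failsOnlyAtRunTime() {
        Vector<Object> v = new Vector<>();
        v.addElement("forty-two");                 // compiles: any Object is accepted
        try {
            Integer n = (Integer) v.elementAt(0);  // run-time type check happens here
            return n == null;                      // not reached for this element
        } catch (ClassCastException e) {
            return true;                           // the error shows up at run time
        }
    }
}
```

With a parameterised element type, the incorrect cast would be unnecessary and the misuse would be rejected by the compiler.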

The proposal by the MIT group [Myers 97] is based on the mechanism of the language Theta [Liskov 95]. The second proposal is contained in the language Pizza [Odersky 97] which is an extension of Java including not only parameterised types but also first-class functions and algebraic types. We feel that a number of alternatives and important practical issues have remained undiscussed in all of these proposals. These include:

• the main uses for parameterised types and good practical support for these uses
• a discussion of the possible parameters of types
• primitive types as type parameters
• conformity with the general style of Java
• possible alternatives for the instantiation and naming of parameterised types
• compatibility of instantiated parameterised types
• appropriate extensions to the Java standard packages

In particular, the first point has a major influence on decisions concerning the others. Examples of parameterised types are almost invariably collection types of some kind such as sets, lists and stacks with the type of the elements being the parameter to the parameterised type. It seems clear that abstract or concrete collection types are by far the most common use for type parameterisation. This can be seen, for example, in the standard Eiffel class library [Meyer 94] and in the Standard Template Library [Musser 96] for C++.

Collections of data are one of the most important aspects of software development. The consistent use of standard generic collection types can greatly reduce development costs and improve program clarity and can lead to a higher-level style of programming (as was the goal in so-called 'very high level' languages such as SETL [Schwartz 86]). The fact that parameterised types are used mainly for defining collection types was the main influence in the design of the generic Java extension Genja. The most common use of a mechanism in a programming language should be supported as well as possible.

This paper presents and discusses some of the main design decisions in the development of Genja. The following section discusses the basic questions of what should be made parameterisable in a generic Java and what the parameters should be. The third section presents the main difference to other suggestions - the realisation of generic method parameters in Genja. The fourth section discusses alternatives for instantiation and compatibility of parameterised types and the final section addresses implementation issues.

2 Generic Definitions and Generic Parameters

2.1 Generic units

The most fundamental decision in adding parameterised types to Java is already made by the name 'parameterised types', namely the question of which constructs of the language should be made parameterisable. Types are not the only possibility. Ada, for example, allows both packages and subprograms to be defined generically. Modula-3 [Cardelli 92] makes modules and module interfaces generic rather than individual type definitions.

In accordance with our principal aim of supporting abstract collection types such as 'List' and concrete collection types such as 'LinkedList', Genja offers generic interface types and generic class types, as do both Pizza and the MIT proposal. Generic methods, as well as being unnecessary for our primary aim, seem inappropriate in an object-oriented language where objects have first-class status but individual methods do not. Generic packages (corresponding to the generic modules of Modula-3) are conceivable but also seem superfluous if types can be generic. Packages can then remain simply units for grouping and visibility control.

2.2 Generic parameters

Both Pizza and the MIT proposal allow primitive types and reference types as parameters to types. Other languages, such as a recent proposal for 'light-weight' parameterised types for Oberon [Roe 97], allow only reference types since implementation is then easier and unconstrained genericity is simplified (see section 2.3). Ada and C++ allow both types and constants as type parameters. This is useful if a language allows constants but not variables for some purposes, for example in the definition of subrange types or for array boundaries.

Since a list of integers can be just as useful as a list of anything else, Genja allows primitive types as well as reference types as generic parameters. And since Java has no user-defined subrange types and arrays can be dynamically allocated there seems to be no need for constants as generic parameters. In these decisions we conform again to the decisions in the two existing proposals but, as presented in section 3, we additionally propose a new kind of generic parameter especially useful for general collection types.

2.3 Unconstrained genericity

What can be done with entities (parameters or variables) within a generic class which have the type of a type parameter? In unconstrained genericity, the actual type parameter can potentially be any Java (or Genja) type. This implies that only operations which apply to all types can be allowed on these generic entities. In fact, this means that only assignment via '=' (and, equivalently, parameter passing) and comparison for equality with '==' and '!=' can be allowed.

This is rather a shame. There are some operations, such as hashing a key value, which would be useful within collection implementations but which do not have a common syntax for all types in Java. For a 'String', for example, we would need the call:

    k.hashCode();

which is, of course, not valid for an 'int'. There are a number of possible approaches to solving this problem. (The question is not addressed at all in the Pizza paper.)

1. Treat primitive types as objects

This is the approach suggested in the MIT proposal. Methods are assumed to exist for primitive types as well as for reference types (at least for this purpose) and to be callable with the same syntax. No particular methods are suggested in the proposal. It is merely stressed that common naming conventions are important. An example of this approach might look like this:

    public generic class HashSearchBag[ELEM, KEYTYPE] {
        ...
        public ELEM find(KEYTYPE k) {
            int h = k.hashCode();
            ...
        }
    }

There are three arguments against this approach. Firstly, it seems contrary to the Java style to allow method calls to primitive types as if they were objects. Secondly (in the MIT version), the operation to be used within the generic class must be listed in a 'where'-clause. Thirdly, and most importantly, whatever the methods may be which are defined for primitive types, they are fixed. It is not possible for a programmer to introduce a new operation which is applicable to any type.

2. Provide a 'generic switch' construct

The syntax for equivalent operations on different types need not be identical if we can write different code for various possible actual parameter types within the generic class. For example:

    public generic class HashSearchBag[ELEM, KEYTYPE] {
        ...
        public ELEM find(KEYTYPE k) {
            int h;
            genswitch (KEYTYPE) {
                Object: h = k.hashCode();
                int: h = k;
                ...
            };
            ...
        }
    }

The switch would be required to list at least the cases 'Object', 'boolean' and 'double' so that all types are handled. The appropriate code would be selected by the compiler for a particular instantiation. This approach has the advantage of being very flexible but the disadvantages of introducing a new language construct, requiring more programming effort and being rather cumbersome.

3. Make use of method overloading

There is in fact already a mechanism in Java which enables us to solve this problem: the overloading of methods based on their signatures. In Genja we define that generic entities can be passed as parameters in method calls if the method is overloaded to allow at least the parameter types 'Object', 'boolean' and 'double' and if the result type of the method is the same in each case. The standard Genja class 'Any' (see Appendix B) provides a number of static methods of this kind, including 'hashCode'. We can then write the example as:

    public generic class HashSearchBag[ELEM, KEYTYPE] {
        ...
        public ELEM find(KEYTYPE k) {
            int h = Any.hashCode(k);
            ...
        }
    }

This approach has the flexibility of the generic switch while requiring no new language construct.
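The overloading idea can be tried out in plain Java. The sketch below is a hypothetical stand-in for Genja's 'Any' class, not the definition from Appendix B: the class name 'AnySketch' and all method bodies are assumptions. It relies on the fact that in Java every reference type is compatible with 'Object' and every numeric primitive type can be widened to 'double', so overloads for 'Object', 'boolean' and 'double' together accept an argument of any type. The extra 'int' overload is added here only so that an 'int' key hashes to itself, as in the generic switch example.

```java
// Hypothetical stand-in for Genja's 'Any' class; the method bodies are assumptions.
final class AnySketch {
    private AnySketch() {}

    // Accepts every reference type.
    static int hashCode(Object o) {
        return o == null ? 0 : o.hashCode();
    }

    // Accepts 'boolean', the only primitive type that cannot be widened to 'double'.
    static int hashCode(boolean b) {
        return Boolean.hashCode(b);
    }

    // Accepts 'double' and, by widening, the numeric primitives without a closer overload.
    static int hashCode(double d) {
        return Double.hashCode(d);
    }

    // Optional, more specific overload: an 'int' key is used as its own hash value.
    static int hashCode(int i) {
        return i;
    }
}
```

A call such as 'AnySketch.hashCode(k)' then compiles whatever the static type of 'k' is, and the compiler selects the most specific overload. Because the overloads are ordinary static methods, a programmer can add further operations of this kind without any new language construct.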

2.4 Constrained genericity

In unconstrained genericity any type may be provided as an actual parameter and therefore nothing can be assumed about the methods of the type except what is valid for all types. In many cases these operations are not sufficient. In an 'OrderedList' class, for example, an operation 'precedes' is required to determine which of two elements should precede the other. Such an operation is not present in all types. Constrained genericity restricts the allowed actual type parameters to types having the required methods.

The traditional way of providing constrained genericity is via the inheritance mechanism. In Eiffel, for example, a formal generic type parameter can be declared in such a way that only subtypes of some specified type are allowed as actual parameters. A number of problems have been identified in this approach [Cook 89] and a considerable amount of research has gone into finding alternatives. The MIT proposal adopts the 'where'-clause mechanism of Theta. Pizza uses F-bounded polymorphism [Canning 89] for constrained genericity.

As discussed in [Evered 97], both these and other mechanisms for constrained genericity such as generalisation [Pedersen 89], structural subtyping [Abadi 96] and matching [Bruce 97] have severe problems with regard to generic collection types. Firstly, a potential element type of 'OrderedList' must contain the method 'precedes' with exactly that name. This may not be the case if the type was not explicitly designed with 'OrderedList' in mind. More importantly, it is not possible to have two ordered lists with the same element type but different ordering criteria. In the following section we present an alternative to constrained genericity which is more flexible and more useful for generic collection classes.
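The second problem can be reproduced with the bounded type parameters of present-day Java, which constrain the element type in an F-bounded style. The sketch below is an illustration under that assumption and is not code from any of the proposals discussed; 'BoundedOrderedList' and its methods are hypothetical names, and 'compareTo' plays the role of 'precedes'.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical ordered list whose ordering is fixed by a method of the element type.
final class BoundedOrderedList<E extends Comparable<E>> {
    private final List<E> items = new ArrayList<>();

    // Inserts 'e' after all stored elements that compare less than or equal to it.
    void insert(E e) {
        int i = 0;
        while (i < items.size() && items.get(i).compareTo(e) <= 0) {
            i++;
        }
        items.add(i, e);
    }

    E get(int index) {
        return items.get(index);
    }
}
```

A 'BoundedOrderedList<String>' is always ordered by 'String.compareTo'. A second list of strings ordered, say, by length cannot be expressed without wrapping the element type, because the ordering belongs to the element type rather than to the particular list.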

3 Generic Method Parameters

3.1 Instantiation-specific operations

The problem with constrained genericity is that the operations required by a generic class are assumed to be methods of the types provided as parameters and this is often not what we really need. In many cases, including 'OrderedList', we would prefer the operations to be a property of a particular collection type rather than a property of the element type.

In Genja the problem is solved by allowing the required operations to be provided directly as generic parameters at the time of generic instantiation. That is, in addition to types as generic parameters we also allow generic method parameters. The definition for 'OrderedList' then looks like this:

    class OrderedList[ELEM, boolean PRECEDES(ELEM e1, ELEM e2)] { ... }
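Genja itself cannot be run here, so the following sketch only approximates a generic method parameter in present-day Java by passing the ordering to the constructor as a 'BiPredicate'. This is an assumption made for illustration: in Genja the operation is supplied at generic instantiation, whereas here it is an ordinary run-time value. The methods 'insert' and 'get', the class 'OrderedListDemo' and the two ordering expressions are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiPredicate;

// Approximation only: PRECEDES becomes a constructor argument instead of a generic parameter.
final class OrderedList<E> {
    private final BiPredicate<E, E> precedes;
    private final List<E> items = new ArrayList<>();

    OrderedList(BiPredicate<E, E> precedes) {
        this.precedes = precedes;
    }

    // Inserts 'e' in front of the first stored element that it precedes.
    void insert(E e) {
        int i = 0;
        while (i < items.size() && !precedes.test(e, items.get(i))) {
            i++;
        }
        items.add(i, e);
    }

    E get(int index) {
        return items.get(index);
    }
}

final class OrderedListDemo {
    private OrderedListDemo() {}

    // Builds two lists with the same element type but different ordering criteria
    // and returns the first element of each, separated by a comma.
    static String firstElements() {
        OrderedList<String> byLength =
            new OrderedList<>((e1, e2) -> e1.length() < e2.length());
        OrderedList<String> alphabetical =
            new OrderedList<>((e1, e2) -> e1.compareTo(e2) < 0);
        for (String s : new String[] {"apple", "kiwi"}) {
            byLength.insert(s);
            alphabetical.insert(s);
        }
        return byLength.get(0) + "," + alphabetical.get(0);
    }
}
```

Here the ordering is a property of each list rather than of the element type, so 'String' needs no method named 'precedes' and the two lists can coexist.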

As the definition shows, the class 'OrderedList' is passed an operation which takes two entities of the type 'ELEM' and returns a boolean value which is 'true' if 'e1' should precede 'e2' in the list. The actual parameter for the generic method parameter 'PRECEDES' must be an expression. The expression can refer to the parameters 'e1' and 'e2' and must produce a boolean value. For example:

    class Student {
        public int studNr;
        public String name;
        ...
    }

    class SomeClass {
        public void someMethod () {
            OrderedList[Student, e1.studNr