Fostering Component Evolution with C# Attributes

[Full Paper]

Carlo Ghezzi

Mattia Monga

Politecnico di Milano Dipartimento di Elettronica e Informazione Piazza Leonardo da Vinci, 32 I 20133 Milano, Italy

Politecnico di Milano Dipartimento di Elettronica e Informazione Piazza Leonardo da Vinci, 32 I 20133 Milano, Italy

[email protected]

[email protected]

ABSTRACT

This paper discusses the problems that arise when object-oriented libraries are evolved through the subclass mechanism. The overriding of a method may in fact produce undesirable side effects in the behavior of other methods. More generally, the designer of an extension may be unaware of the dependencies among class features, which should be taken into account when a class is evolved. The paper shows how the C# language allows such dependencies to be documented using attributes. Attributes may be retrieved via reflective mechanisms that can be used by a tool – a design assistant – which may guide designers while they evolve and reuse existing class libraries. To facilitate the approach, another tool may automatically record dependency attributes for each class. The approach is also shown to help in the case of the so-called semantic fragile base class problem that has been illustrated in the literature.

Categories and Subject Descriptors

D.2 [Software]: Software Engineering

1.

MOTIVATION

Information hiding can be seen as the main founding principle of object oriented systems. Classes are groups of routines that share a secret implementation, and their functionality can be used by exploiting only the information revealed through their public interfaces. Programmers are not allowed to write new classes depending on private members of existing classes. Such prohibition is enforced by the type system, both at compile and run time. Thus, classes provide the syntactic boundaries through which only a subset of the internal module resources (its secrets) can be made accessible to external users. Accessible resources are specified by the module interface. A careful definition of the module interface is the basis for module reuse. In general, one can distinguish between two kinds of users (and hence, two kinds of interfaces):

1. users whose methods employ the public features of a class by accessing an attribute or calling a method;
2. users who inherit the code of a class by producing a derived class.

Such users may be called clients and inheritors, respectively. Clients and inheritors need to customize the classes they use. In order to adapt the used class to their needs they can:

• as clients, pass suitable parameters to methods;
• as inheritors, override inherited methods with new versions.

Designers of programming languages must decide which information can be made visible to clients or inheritors, and which information can be safely hidden. In principle, the information needed by clients could be different from that needed by inheritors. Instead, most object oriented languages express the two interfaces – hereafter, C-interface and I-interface respectively – with the same linguistic constructs, i.e. member types and method signatures. Moreover, the C-interface is usually a subset of the I-interface: for example, C++ provides the keyword public to mark a member as available to clients and inheritors and the keyword protected to mark a member as available to inheritors only. The interface establishes a sort of use contract: it states what can be used by clients or inheritors and what they can change by providing actual parameters or by overriding. The semantics of a method can be documented by stating under which conditions (preconditions) the method is guaranteed to produce a well defined effect (postconditions) [8, 2].

IWPSE 2002 Orlando, Florida USA

Although very simple, interfaces specified just by method signatures allow the type system to catch large families of errors, both for clients and inheritors. Even if the object oriented language provides dynamic binding (i.e. the dynamic type of an object can be a subtype of its static type), most type errors can be caught statically by requiring the overriding of member functions to conform to an invariance rule: overriding cannot change the type of formal parameters (see footnote 1). Moreover, in order to guarantee full substitutability for objects of a given class and objects of a subclass, preconditions and postconditions of overriding methods must be changed only in a controlled way, by weakening (or preserving) preconditions and strengthening (or preserving) postconditions [7].

Footnote 1: Actually, this rule could be relaxed. A contravariant change is a change that substitutes a type with a more general one. C++, Java, and C# do not allow a change of this kind. However, in other languages (e.g. Sather, Emerald) this is possible: formal parameters of overriding methods can be changed contravariantly without any threat to type safety. Some other languages (e.g. Eiffel, Ada) claim that contravariance is unnatural to designers, and allow instead covariant changes (that is, substituting a formal parameter type with a more specific one). Such a rule, however, might cause type errors.

Method signatures and pre/post conditions provide enough information to support client reuse. However, as far as inheritors are concerned, this information is not sufficient. Inheritors must evolve base class members with strong discipline, but the constraints we showed on overriding methods are not enough to guarantee system correctness. In particular, it may still happen that the overriding of a method requires other methods to be rewritten as well in order to preserve correct semantics [3].

    using System;
    using System.Collections;

    public class Set {
        private ArrayList hiddenRep = new ArrayList();
        public delegate void Action(object o);

        public virtual void Add(object o) {
            hiddenRep.Add(o);
        }
        public virtual bool RemoveIfPresent(object o) {
            if (!hiddenRep.Contains(o)) return false;
            hiddenRep.Remove(o);
            return true;
        }
        public virtual void ForEach(Action a) {
            IEnumerator i = hiddenRep.GetEnumerator();
            while (i.MoveNext()) a(i.Current);
        }
        // Built on Add and ForEach.
        public virtual void AddAll(Set s) {
            Action a = new Action(this.Add);
            s.ForEach(a);
        }
        // Built on RemoveIfPresent.
        public virtual void Remove(object o) {
            bool removed = RemoveIfPresent(o);
            if (!removed) throw new Exception("Object not present");
        }
    }

Figure 1: A Set class

Consider the class Set written in C# shown in Figure 1: the methods AddAll and Remove rely on the other methods Add, ForEach, and RemoveIfPresent. This dependency, unfortunately, is not documented in the inheritor interface, although it could be crucial.

    public class CountedSet: Set {
        private int cardinality = 0;

        public override void Add(object o) {
            base.Add(o);            // call the inherited implementation
            cardinality += 1;
        }
        public override bool RemoveIfPresent(object o) {
            if (base.RemoveIfPresent(o)) {
                cardinality -= 1;
                return true;
            }
            return false;
        }
        public int Cardinality {
            get { return cardinality; }
        }
    }

Figure 2: Overriding the Add method propagates the change in the hidden representation

Information on dependencies could be exploited by programmers, who could propagate a change in the hidden representation of the Set state to the whole class just by overriding Add, ForEach, and RemoveIfPresent. For example, the class CountedSet (see Figure 2) augments the state of the Set class with a new cardinality member, in order to track the number of objects in the set: the overriding of Add() and RemoveIfPresent() is sufficient to guarantee the correct semantics.

    public class EvenSet: CountedSet {
        public override void Add(object o) {
            if (this.Cardinality % 2 == 0)
                base.Add(o);
        }
    }

Figure 3: Overriding the Add method compromises AddAll

Inheritors should know that any modification (overriding) of AddAll or Remove has no consequences on other methods, whereas overriding the method Add can change the semantics of AddAll, introducing subtle errors in the derived code. For example, consider the class in Figure 3: the method Add() was overridden and it now adds objects only if the cardinality of the set is even. A programmer could think that it is still possible to build a set with an even cardinality by using AddAll(). Unfortunately, AddAll() relies on Add(), and it can, at most, add the first element.
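The effect described above can be reproduced with a small self-contained sketch. It uses a trimmed copy of the classes from Figures 1 to 3 (RemoveIfPresent and Remove are left out for brevity); the helper class EvenSetDemo and its method CardinalityAfterAddAll are hypothetical names introduced only for this illustration.

```csharp
using System;
using System.Collections;

public class Set {
    private ArrayList hiddenRep = new ArrayList();
    public delegate void Action(object o);

    public virtual void Add(object o) { hiddenRep.Add(o); }

    public virtual void ForEach(Action a) {
        foreach (object o in hiddenRep) a(o);
    }

    // AddAll is built on ForEach and on the (virtual) Add method.
    public virtual void AddAll(Set s) {
        s.ForEach(new Action(this.Add));
    }
}

public class CountedSet : Set {
    private int cardinality = 0;

    public override void Add(object o) {
        base.Add(o);
        cardinality += 1;
    }

    public int Cardinality { get { return cardinality; } }
}

public class EvenSet : CountedSet {
    // Adds an element only while the current cardinality is even.
    public override void Add(object o) {
        if (Cardinality % 2 == 0) base.Add(o);
    }
}

// Hypothetical helper, not part of the paper.
public static class EvenSetDemo {
    // Fills a plain Set with n elements, copies it into an EvenSet
    // through AddAll, and reports how many elements actually arrived.
    public static int CardinalityAfterAddAll(int n) {
        Set source = new Set();
        for (int i = 0; i < n; i++) source.Add(i);

        EvenSet target = new EvenSet();
        target.AddAll(source);
        return target.Cardinality;
    }
}
```

Calling EvenSetDemo.CardinalityAfterAddAll(4) yields 1 rather than 4: the delegate created inside AddAll dispatches to EvenSet.Add, which refuses every element after the first one.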

2.

LAMPING’S PROPOSAL

In order to cope with this problem, Lamping [5] proposed to enrich the I-interface with dependencies among class features. The rationale behind this suggestion is to provide inheritors with enough information about how the features of a class combine to produce its overall behavior, so that programmers fully understand the consequences of overriding. In particular, inheritors need to know what methods may be changed without undesired effects or how far-reaching these effects will be. In the code shown in Figure 1, if we evolve the Set class, it is crucial to know that methods Add, ForEach, and RemoveIfPresent have a more fundamental status than the other two, which are built upon them. In other words, Add, ForEach, and RemoveIfPresent rely only on the hidden class representation, while AddAll and Remove rely also on a more specialized manipulation of the class state.

In general, each method uses a subset of the features of its class, i.e., it could be a method of a superclass, that is, a more general class providing only that subset of features. In other words, it relies on a more general type. According to this approach, each method relies on a type that is, in general, a weaker (i.e. less specific) type than the one defined by the class itself (see footnote 2). For example, the method Add knows nothing about methods AddAll, Remove, ForEach, and RemoveIfPresent: its functionality relies only on a set representation. We can imagine a super-class SetRepresentation providing just the needed features and safely say that the this reference used by Add has that type, i.e. Add can manipulate this only according to the protocol codified by SetRepresentation.

It is typically possible to organize a hierarchy of types. Figure 4 shows how the Set type can be decomposed in layers: e.g., the SetProtocol relies on the CoreSetProtocol layer; when this layer is changed, the whole SetProtocol layer should be overridden. Therefore, class methods can be decorated with a declaration of the layer they rely on, and these annotations could be part of the interface reserved for inheritors.

When a class is sub-classed, these annotations can be exploited to check the coherence of overriding; for example, the method AddAll guarantees its contract as long as the protocol on which it is based (i.e. the type defined by ForEach and Add) remains unchanged. Figure 5 documents type dependencies in a comment before each method.

    abstract class SetRepresentation {
        // hidden representation
    }

    abstract class CoreSetProtocol: SetRepresentation {
        public delegate void Action(object o);
        public abstract void Add(object o);
        public abstract bool RemoveIfPresent(object o);
        public abstract void ForEach(Action a);
    }

    abstract class SetProtocol: CoreSetProtocol {
        public abstract void AddAll(SetProtocol s);
        public abstract void Remove(object o);
    }

Figure 4: Set and other weaker types with their relations

    class Set: SetProtocol {
        /*SetRepresentation*/
        public override void Add(object o) { /*...*/ }
        /*SetRepresentation*/
        public override bool RemoveIfPresent(object o) { /*...*/ }
        /*SetRepresentation*/
        public override void ForEach(Action a) { /*...*/ }
        /*CoreSetProtocol*/
        public override void AddAll(SetProtocol s) { /*...*/ }
        /*CoreSetProtocol*/
        public override void Remove(object o) { /*...*/ }
    }

Figure 5: Set annotated with dependencies

Footnote 2: Sometimes methods act as pure functions, because they do not make use of the object state: in this case they rely on the most general type, a type without any feature or constraint.

3.

OPPORTUNITIES FOR EXPRESSING DEPENDENCIES IN C#

Dependencies could be introduced as syntactic elements in method declarations. However, as far as we know, no real language has implemented this option. It is possible to introduce dependencies in comments, as we did in Figure 5. These comments could then be analyzed by a program. Some tools (for example, [1]) aimed at adapting Java to design by contract [8] exploit this approach. A limitation of this approach is the need for source code availability in order to analyze comments. However, the usefulness of dependencies arises mainly when source code is not available and programmers want to evolve binary components of which they know only the interface information [13]. In fact, several component frameworks (JavaBeans, COM) owe their success to the ability to deploy binary objects developed by third parties, about which users know only the public member signatures.

3.1

C# Attributes

A new opportunity is offered by the .NET [15] runtime platform. Such a platform provides support for attributes, which are annotations associated with syntactic elements of a program: classes, members, method parameters, etc. In the .NET jargon an assembly is the unit of deployment, containing one or more binary modules. Each assembly contains virtual machine instructions (data) and information related to them (metadata): version numbers, character set used in strings, etc. The attribute information is stored within the metadata of the assembly. Metadata can be customized by using custom defined attributes and can be easily retrieved at runtime through the reflection services provided by the .NET framework. According to the standard, all .NET aware languages are able to manipulate assembly metadata and therefore custom attributes.

In C# [14], attributes are objects of the class System.Attribute or of a user-defined sub-class. They can be associated with assemblies, modules, types, methods, properties, and parameters by instantiating them at compile time in special sections of the code between brackets. For example, we can define a CodeReviewAttribute containing a comment of a reviewer, and optionally an integer id for classification purposes. The C# code needed is shown in Figure 6. Notice that the built-in attribute AttributeUsageAttribute is applied to the CodeReviewAttribute class itself, documenting that the attribute will be associated with classes and methods, and that it can be applied multiple times to the same entity. The built-in attribute AttributeUsageAttribute appears with a shorter name, because when an attribute class is used, the suffix Attribute can be omitted. Figure 7 creates two instances of the custom attribute CodeReviewAttribute and associates them with the class Complex and the method Complex.RealPart(). Figure 8 shows how it is possible to retrieve the information stored in custom attributes by exploiting reflection facilities.

    [AttributeUsage(AttributeTargets.Class | AttributeTargets.Method,
                    AllowMultiple = true)]
    class CodeReviewAttribute: System.Attribute {
        public CodeReviewAttribute(string comment) {
            this.comment = comment;
        }
        public string Comment {
            get { return comment; }
        }
        public int Id {
            get { return id; }
            set { id = value; }
        }
        private string comment;
        private int id = 0;
    }

Figure 6: A CodeReview attribute class

    [CodeReview("Mattia")]
    public class Complex {
        [CodeReview("Good trick!", Id = 5)]
        public double RealPart() {
            // ...
        }
    }

Figure 7: Instancing C# attributes

    Type t = typeof(Complex);
    foreach (CodeReviewAttribute a in
             t.GetCustomAttributes(typeof(CodeReviewAttribute), false)) {
        // use a.Comment and a.Id
    }

Figure 8: Querying C# attributes

Attributes can be used to record information about dependencies. This information remains within the deployed binary assemblies and can be analyzed by suitable tools, which can support developers in building their own components coherently with base components selected from a library.
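The define, apply, and query steps of Figures 6 to 8 can be combined into one compilable sketch. The attribute and the Complex class follow the figures; the body of RealPart, the field re, and the helper class ReviewQuery with its two methods are assumptions added only so that the example is complete.

```csharp
using System;
using System.Reflection;

[AttributeUsage(AttributeTargets.Class | AttributeTargets.Method, AllowMultiple = true)]
public class CodeReviewAttribute : Attribute {
    private string comment;
    private int id = 0;

    public CodeReviewAttribute(string comment) { this.comment = comment; }

    public string Comment { get { return comment; } }
    public int Id { get { return id; } set { id = value; } }
}

[CodeReview("Mattia")]
public class Complex {
    private double re = 0.0;   // placeholder state, not in the paper

    [CodeReview("Good trick!", Id = 5)]
    public double RealPart() { return re; }
}

// Hypothetical helper, not part of the paper.
public static class ReviewQuery {
    // Returns the comment of the first CodeReview attribute found on the type,
    // or null when the type carries none.
    public static string ClassComment(Type t) {
        foreach (CodeReviewAttribute a in
                 t.GetCustomAttributes(typeof(CodeReviewAttribute), false)) {
            return a.Comment;
        }
        return null;
    }

    // Returns the Id of the first CodeReview attribute found on the named
    // public method, or -1 when the method carries none.
    public static int MethodReviewId(Type t, string methodName) {
        MethodInfo m = t.GetMethod(methodName);
        foreach (CodeReviewAttribute a in
                 m.GetCustomAttributes(typeof(CodeReviewAttribute), false)) {
            return a.Id;
        }
        return -1;
    }
}
```

Because the attribute values live in the assembly metadata, the same queries work against a compiled library for which no source code is available, which is the property the approach relies on.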

3.2

Exploiting C# Attributes

As we saw in Section 2, Lamping’s original proposal was to specify a type for the self-object manipulated by each method. In general, this type is a super-type of the one defined by the class the method belongs to. A straightforward implementation of this idea, without modifications, forces programmers to define all the different types needed by dependencies, as we did in Figure 4. This is awkward, pollutes the design with classes created ad hoc, and is also a source of design problems in single inheritance languages such as Java and C#. In order to make practical use of the underlying concept, we suggest a characterization of dependencies based on sets of methods, and in the remaining part of this Section we show how this can be accomplished by exploiting C# attributes.

First, we define (see Figure 9) a DependencyAttribute containing just a string member dep. Attribute instances can be applied to public and protected methods and properties. If an attribute Dependency("Signature") is associated with a method (or a property) M of the class C, this association documents that M uses C.Signature (i.e., it depends on the presence of a coherent implementation of a method Signature in the class C). Since a method can depend on multiple features, multiple instances of DependencyAttribute can be applied to each method or property. When no DependencyAttribute is associated with a method (or a property), then the method (or property) is assumed to depend on all of the features of its class. Two symbolic dependencies are predefined:

• NONE, meaning that the method is a pure function: this is incompatible with any other dependency;
• HIDDEN, meaning that the method relies on the hidden representation of the object.

The attributes document only the direct dependencies among methods: thus, if method A relies on the method B, which relies on C, then we document only the dependencies between A and B, and between B and C, respectively.

    [AttributeUsage(AttributeTargets.Method | AttributeTargets.Property,
                    AllowMultiple = true)]
    public class DependencyAttribute: System.Attribute {
        public enum SpecialDependency { NONE, HIDDEN };

        public DependencyAttribute(string dep) {
            this.dep = dep;
        }
        public DependencyAttribute(SpecialDependency dep) {
            switch (dep) {
                case SpecialDependency.NONE: this.dep = ""; break;
                case SpecialDependency.HIDDEN: this.dep = null; break;
            }
        }
        public string Dependency {
            get { return dep; }
            set { dep = value; }
        }
        private string dep;
    }

Figure 9: Definition of the class DependencyAttribute

Second, we can use the list of DependencyAttribute instances associated with a method to check whether dependencies are honored by inheritors. A tool acting as a design assistant can in fact use dependencies to provide advice about modifications via subclassing. When we want to override a method M of a class B in a derived class D, the assistant might suggest that all methods of B that have M among their dependencies should be overridden as well. Or, more specifically, it might distinguish between two cases:

1. D.M and B.M have identical pre and post-conditions;
2. pre and post-conditions of D.M and B.M differ.

In the first case it is in general safe to leave methods that depend on B.M unchanged (however, in Section 4 we show a counter-example). The second case is normally the critical one. Let M_pre be the preconditions of a method M and M_post its postconditions. In order to honor the substitution rule [7], so that the overriding method D.M can be safely substituted for the overridden B.M in all its occurrences, the following property must hold:

$$B.M_{pre} \rightarrow D.M_{pre} \;\wedge\; D.M_{post} \rightarrow B.M_{post}$$

When the above condition does not hold, all the members of B that have M in the list of their dependencies should be overridden. Methods marked with the SpecialDependency NONE can be overridden without side effects of any kind. Methods having among their dependencies the SpecialDependency HIDDEN should be overridden every time the internal representation is changed by adding private members. The dependency HIDDEN denotes that the method relies on the hidden implementation of the module. However, if new private members were added by a subclass, the dependency HIDDEN would denote a different hidden implementation, thus methods relying on it should be overridden. Methods without any associated DependencyAttribute must always be overridden, assuming that a derived class differs from its base class.

When a developer builds a new class by inheriting from an old one, the dependencies associated with methods and properties of the base class are retrieved by the design assistant via reflection facilities, and they are compared with the set of methods and properties overridden by the derived class. Let B be the base class and D the derived one, then (in pseudo code):

    foreach (Member M in B.GetMembers()) {
        // M is inherited unchanged by D ...
        if (!ExistOverriding(D.M)) {
            foreach (Member delta in
                     (B.M).GetCustomAttributes(typeof(DependencyAttribute))) {
                // ... but one of the features it depends on was overridden
                if (ExistOverriding(delta)) {
                    SuggestOverriding(D.M);
                }
            }
        }
    }

While useful, documenting classes with DependencyAttributes can be tedious and error-prone for developers. Instead, the dependencies can be computed from the source code by a suitable tool. In principle, this tool can be the compiler itself. For each method or property M of the class C, can M be compiled if the class were without the feature F? If the answer is negative, then M depends on F; this test should be iterated for all features. This is an approximation: if the code uses delegates (the C# name for typed function pointers) or methods are called by constructing their name at run-time, data flow analysis and program slicing techniques are needed.

4.

AN APPLICATION TO THE FRAGILE BASE CLASS PROBLEM

In order to show the usefulness of DependencyAttributes, in this Section we apply our approach to an instance of the so-called semantic fragile base class problem examined in the literature [9]. The semantic fragile base class problem emerges when a change in a base class breaks the correctness of derived ones. In general, library developers are unaware of the extensions developed by users, and while attempting to improve the functionality of the library they may produce a seemingly acceptable revision of their classes which conflicts with user extensions. This problem often arises when users derive their own version of library classes (for instance, containers). Its critical influence has even forced library developers to deny inheritance from most of their classes (sealed classes, in the C# jargon), and some authors (see for example [6]) suggest always using the delegation pattern when library classes are involved.

Figure 10 shows a hypothetical library class with two methods with identical semantics. We imagine that an inheritor of the class derives a new class as shown in Figure 11. Class UserClass produces objects that are perfect substitutes for the instances of LibraryClass: in fact, the methods of the derived class have the same behavior as the ones of the base class.

    public class LibraryClass {
        private int x = 0;
        public virtual void Method1() { x++; }
        public virtual void Method2() { x++; }
    }

Figure 10: A library class

    public class UserClass: LibraryClass {
        public override void Method1() { Method2(); }
    }

Figure 11: A class derived by an inheritor

However, library developers could produce a new version of their class, such as the class shown in Figure 12. This class, while it is also a perfect substitute for the class in Figure 10, totally breaks the UserClass by introducing a mutual recursion.

    public class LibraryClass {
        private int x = 0;
        public virtual void Method1() { x++; }
        public virtual void Method2() { Method1(); }
    }

Figure 12: A library class

If library classes were decorated with DependencyAttributes as shown in Figure 13, the problem would be caught when the modified subclass is re-examined by the tool along with the new version of the library class. As the reader can check by examining Figure 14, the new version of LibraryClass would have different dependencies, and the analysis of the UserClass would have suggested that inheritors who overrode Method1() also override Method2(), because it relies on Method1(). This example, while artificial, is representative of a large class of real problems that can be solved by exploiting our approach.

    public class LibraryClass {
        private int x = 0;
        [DependencyAttribute(DependencyAttribute.SpecialDependency.HIDDEN)]
        public virtual void Method1() { x++; }
        [DependencyAttribute(DependencyAttribute.SpecialDependency.HIDDEN)]
        public virtual void Method2() { x++; }
    }

Figure 13: A library class

    public class LibraryClass {
        private int x = 0;
        [DependencyAttribute(DependencyAttribute.SpecialDependency.HIDDEN)]
        public virtual void Method1() { x++; }
        [DependencyAttribute("void Method1()")]
        public virtual void Method2() { Method1(); }
    }

Figure 14: A library class
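As an illustration of how such a check might be mechanized, the following sketch runs a reflection-based analysis over the revised library class of Figure 14 and the UserClass of Figure 11. It is only a sketch under stated assumptions: the class DesignAssistant and its method SuggestOverrides are hypothetical names, the paper does not prescribe an implementation, and dependencies are matched by simple method name ("Method1") instead of the full signature string used in Figure 14.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

[AttributeUsage(AttributeTargets.Method | AttributeTargets.Property, AllowMultiple = true)]
public class DependencyAttribute : Attribute {
    public enum SpecialDependency { NONE, HIDDEN };
    private string dep;

    public DependencyAttribute(string dep) { this.dep = dep; }
    public DependencyAttribute(SpecialDependency dep) {
        switch (dep) {
            case SpecialDependency.NONE: this.dep = ""; break;
            case SpecialDependency.HIDDEN: this.dep = null; break;
        }
    }
    public string Dependency { get { return dep; } }
}

// Revised library class (Figure 14); the dependency is given by simple name.
public class LibraryClass {
    private int x = 0;
    [Dependency(DependencyAttribute.SpecialDependency.HIDDEN)]
    public virtual void Method1() { x++; }
    [Dependency("Method1")]
    public virtual void Method2() { Method1(); }
}

// User extension (Figure 11), written against the old library version.
public class UserClass : LibraryClass {
    public override void Method1() { Method2(); }
}

// Hypothetical design assistant, not part of the paper.
public static class DesignAssistant {
    // Returns the names of virtual methods of baseType that derivedType
    // inherits unchanged although one of their documented dependencies
    // is overridden in derivedType.
    public static List<string> SuggestOverrides(Type baseType, Type derivedType) {
        const BindingFlags declared = BindingFlags.Public | BindingFlags.NonPublic
                                    | BindingFlags.Instance | BindingFlags.DeclaredOnly;

        // Collect the methods that derivedType itself overrides.
        HashSet<string> overridden = new HashSet<string>();
        foreach (MethodInfo m in derivedType.GetMethods(declared)) {
            if (m.GetBaseDefinition().DeclaringType != m.DeclaringType)
                overridden.Add(m.Name);
        }

        List<string> suggestions = new List<string>();
        foreach (MethodInfo m in baseType.GetMethods(declared)) {
            if (!m.IsVirtual || overridden.Contains(m.Name)) continue;
            foreach (DependencyAttribute d in
                     m.GetCustomAttributes(typeof(DependencyAttribute), false)) {
                if (overridden.Contains(d.Dependency)) {
                    suggestions.Add(m.Name);
                    break;
                }
            }
        }
        return suggestions;
    }
}
```

For the pair (LibraryClass, UserClass) the sketch suggests overriding Method2, because UserClass overrides Method1 and the revised Method2 declares a dependency on it. The analysis only reads metadata and never invokes the methods, so the mutual recursion is reported without being triggered.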

5.

RELATED WORKS

The work of Lamping [5] forms the conceptual basis for the approach described in this paper. The importance of internal dependencies and the need to document them was stressed by other works of Lamping et al. [4, 3], where they advocated the use of a specialization interface, distinct from the interface offered to clients. Specialization interfaces were augmented by Stata et al. [11] by incorporating full behavioral specifications. A detailed study on reuse operators and their interactions was presented by Steyaert et al. [12]. They proposed to associate with any class a reuse contract, composed of method signatures annotated with dependency information. Reuse contracts record the protocol between developers and inheritors of a reusable component, and they can be amended by using a number of predefined operators (extension, refinement, and concretization). In their paper, Steyaert et al. explain how combining these operations can cause conflicts.

Ruby et al. [10] described a documentation technique based on special pre-formatted comments for the Java language. These comments can be processed by a suitable tool to produce a specifications component containing use relationships among methods and attributes of a class. Such components can be distributed with binary functional ones. Thus, functional components can be evolved safely, without analyzing superclass code, by detecting when the contract described in the specialization component is broken. A tool can be used to either prevent breaking the contract, or to inform the user of what classes have changed in an incompatible way.

6.

CONCLUSIONS

Although important results have recently been achieved in object oriented software engineering, reusability of binary components still fails to fulfill developers' expectations. One important hurdle to this goal is the lack of adequate documentation accompanying components: a good component should be used and enhanced with reference only to its specification. Clients and inheritors need to customize the components they use. Method signatures and pre/post conditions provide enough information to support client reuse. However, as far as inheritors are concerned, this information lacks a representation of the dependencies among features. Dependencies are needed in order to provide inheritors with enough information about how the features of a class combine to produce its overall behavior, so that programmers fully understand the consequences of overriding. In particular, inheritors need to know what methods may be changed without undesired effects or how far-reaching these effects will be. This paper shows how the .NET framework allows such dependencies to be documented using component metadata, making this information practical to analyze. In fact, metadata may be retrieved from binary components via reflective mechanisms, and a design assistant may guide designers while they evolve and reuse existing class libraries, reducing the risk of conflicts and clashes.

7.

REFERENCES

[1] Jass. http://semantik.informatik.uni-oldenburg.de/~jass, 1999. Jass is copyrighted by the Semantics Group, theoretical informatics, department of informatics at Carl von Ossietzky University of Oldenburg, Germany.
[2] C. Ghezzi and M. Jazayeri. Programming language concepts. John Wiley & Sons, third edition, 1997.
[3] G. Kiczales and J. Lamping. Issues in the design and specification of class libraries. ACM SIGPLAN Notices, pages 435–451, 1992. OOPSLA’92.
[4] G. Kiczales, J. Lamping, C. V. Lopes, A. Mendhekar, and G. Murphy. Open implementation design guidelines. In Proceedings of the 19th International Conference on Software Engineering, Boston, MA, May 1997.
[5] J. Lamping. Typing the specialization interface. ACM SIGPLAN Notices, pages 201–214, 1993. OOPSLA’93.
[6] K. J. Lieberherr. Adaptive Object-Oriented Software: The Demeter Method with Propagation Patterns. PWS Publishing Company, 1996.
[7] B. Liskov and J. M. Wing. A behavioral notion of subtyping. ACM Transactions on Programming Languages and Systems, 16(6):1811–1841, Nov. 1994.
[8] B. Meyer. Object-oriented Software Construction. Prentice Hall, 1988.
[9] L. Mikhajlov and E. Sekerinski. A study of the fragile base class problem. In E. Jul, editor, ECOOP’98, volume LNCS 1445, pages 355–382. Springer-Verlag Berlin Heidelberg, 1998.
[10] C. Ruby and G. Leavens. Safely creating correct subclasses without seeing superclass code. In OOPSLA 2000, Minneapolis, Minnesota, Oct. 2000. ACM.
[11] R. Stata and J. V. Guttag. Modular reasoning in the presence of subclassing. ACM SIGPLAN Notices, 30(10):200–214, Oct. 1995. Proceedings of OOPSLA ’95 Tenth Annual Conference on Object-Oriented Programming Systems, Languages, and Applications.
[12] P. Steyaert, C. Lucas, K. Mens, and T. D’Hondt. Reuse contracts: Managing the evolution of reusable assets. In ACM SIGPLAN Notices, volume 31, pages 268–285. ACM, Oct. 1996. Proceedings of Conference on Object-Oriented Programming, Systems, Languages and Applications.
[13] C. Szyperski. Component Software: Beyond Object-Oriented Programming. Addison Wesley Longman Limited, 1998.
[14] E. TC39/TG2. Draft C# language specification. Technical report, ECMA, Mar. 2001.
[15] E. TC39/TG3. Common language infrastructure. Technical report, ECMA, Oct. 2001.
