Simple Support for Design by Contract in C++

Pedro Guerreiro
Departamento de Informática, Faculdade de Ciências e Tecnologia, Universidade Nova de Lisboa
P-2825-114 Caparica, Portugal
[email protected], http://ctp.di.fct.unl.pt/~pg/

Copyright notice This paper will be published in Qiaoyun Li, Richard Riehle, Gilda Pour and Bertrand Meyer (editors), Proceedings TOOLS 39, IEEE 2001. The material is © 2001 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE.

Abstract

Design by contract can be seen as an advanced software engineering technique for building quality software in a professional environment or as a fundamental programming concept, useful even for elementary programming. If design by contract is an afterthought, sophisticated tool support, with macros, preprocessors or patterns, is acceptable. If it is to be used from the very first programs, it must not be yet another difficult obstacle to the novice programmer. This point of view seems to recommend Eiffel as the sole vehicle for the early introduction of design by contract. However, compromises are possible, if your organization mandates C++, for example. For design by contract in C++ we use a class template, Assertions, which is inherited by the classes we are specifying. This class handles preconditions, postconditions and class invariants, and supports the “old” notation. The assertions themselves are not difficult to implement, but the “old” notation, which is necessary in order to compare the value of an attribute in a postcondition with its value at an earlier stage in the function, raises interesting issues. In most common situations, using the assertions is straightforward. There are, however, rarer cases involving inheritance and recursion that must be handled with discipline.

1. Introduction

One programming language has native support for design by contract [7, 10]. That’s Eiffel. If we want to write software with the design by contract approach, but for some reason we cannot afford the original tool, then we must somehow mimic the preconditions, postconditions and invariants as provided by Eiffel in the language imposed upon us.

Actually, that has already been done for many languages. One example is the iContract system for Java, which uses formal comments that can be handled by a preprocessor, and generates executable instructions to be used for catching bugs while the program is being developed [6]. The JMSAssert system uses the same technique [9]. Another example, also for Java, is the jContractor library [5]. With jContractor, we write the preconditions and postconditions for each method as separate functions with special names, and then let the system do the instrumentation of the methods automatically, using Java reflection. For C++, we have the Nana library, which defines a set of macros working together with a given debugger [8]. And there is our own previous work, in which assertions are functions from a class Assertions, which is inherited by the classes we want to develop using design by contract [3]. We are aware of similar efforts to introduce design by contract in other languages, namely, Perl, Python, Common Lisp, and Smalltalk [2, 12, 4, 1].

Probably the only reasonable support for design by contract is one that is thought out from the beginning, and which grows “naturally” from the specification language. The add-on techniques are laudable, but they are a compromise, sometimes an ugly compromise.


Some of the systems we mentioned are industrial-strength systems, aiming to be useful for developing production software. Our goal is more modest. We want a system capable of quickly introducing design by contract to a C++ audience, with no need to install libraries or to understand complex architectures involving new preprocessors or intricate debuggers. Typically, these audiences are composed of C++ programmers who learned the language on their own, starting from C, and who are now trying to catch up with object technology.

Basically, we need the full set of Eiffel assertions, and these assertions should be executable, raising exceptions when they fail. We must be able to turn off some or all of the assertions when doing a release build. We cannot forsake the “old” notation, which enables us to compare, in a modifier function, the final value of an object or of an attribute with its initial value. And we want our solution to be object-oriented, in the sense that we feel free to use multiple inheritance, container classes, iterators, template classes and C++’s standard template library (STL), but no low-level gadgets such as pointers moving around between functions, complicated macros or obscure compilation options.

After this introduction to the problem, we establish a set of recommendations on C++ style, some of which may seem rather unorthodox, but which are necessary to make our system usable. We then present class template Assertions, focusing on preconditions, postconditions and invariants. This class has functions that emulate the “old” notation, which can be used for storing the initial values of integer attributes and of the target object. We illustrate the usage of Assertions with the development of a class for strings, which we first specify and then implement. We then investigate some problems raised by inheritance, which lead us to a refinement of the class. Handling recursive functions, however, forces us to a major modification.

2. C++ Style, unconventional

Usually, a C++ class comes in two files: the header file and the definition file. Where should we write the assertions: in the header file, because assertions are specification? Or in the definition file, since they are executable? Or in both, running the risk of inconsistency (and duplicating work)? We recommend, instead, that we forsake the traditional style and do away with the definition file, using only the header file, as if all functions were defined inline, very much like Java and Eiffel do. This is such a drastic change from C++ normality that it risks killing the endeavor at the outset. On the other hand, maintaining two files for each class is so awkward that sooner or later a C++ development environment will come up that hides that from us, allowing us to concentrate on our classes, without having to worry about where they are stored.

All data members in a class are private. If they were not, they could be modified without control, and that would ruin design by contract. In fact, as far as design by contract goes, we could tolerate protected data members, because the assertions of a class apply to functions in derived classes as well. However, for most practical purposes the binary private/public model of data hiding serves well, and we never found the absolute need to use protected data members in our programs [13]. We are aware, of course, that many existing class libraries use protected members a lot.

Function members, other than the constructor and destructor, are either selectors (i.e., const functions returning a value) or modifiers (i.e., non-const void functions). Class arguments in a function are passed by const reference. This means that the only way to change an object is to explicitly call a modifier on it. In response to such a call, the object may directly change its data members of a basic type, or call further modifiers on its data members of a class type.

No function will modify an object other than its target object. This is a very strict rule, and later we relax it a little, allowing class arguments that are iterators or containers to be modified, in some circumstances.


The rule is not only strict; it is controversial. The easier alternative would be not to have it. If this were the case, we would accept a member function in a class A that modifies its target object and also modifies its argument of class B. But then we are left with no clear reason why this function is in class A with an argument of class B, and not in class B, with an argument of class A. Having a function that does not modify its target object but modifies its argument of class B would also be possible, but likely a bad design decision, because the function should be a modifier in class B, with a constant argument of class A. It is interesting that while most people would readily agree that a selector should only return information about its target, it is more difficult to accept that a modifier should also modify only its target.
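The style rules of this section can be illustrated with a small class. The following is only a sketch: the class Segment and its member names are hypothetical and do not come from the paper. It shows a header-style class with all functions defined inside the class declaration, private data members, const selectors that return a value, void modifiers that change only their target, and a class argument passed by const reference.

```cpp
#include <cassert>

// Hypothetical class, written header-style: every function is defined
// inside the class declaration and there is no separate definition file.
class Segment {
private:
  int low;    // all data members are private
  int high;
public:
  Segment(int low, int high): low(low), high(high) {}

  // Selectors: const functions that return a value.
  int Low() const { return low; }
  int High() const { return high; }
  int Length() const { return high - low; }
  bool Contains(const Segment& other) const
  { return low <= other.Low() && other.High() <= high; }

  // Modifiers: non-const void functions that change only the target.
  void Shift(int delta) { low += delta; high += delta; }

  // The class argument is passed by const reference, so this modifier
  // can read it but cannot change it.
  void Cover(const Segment& other)
  {
    if (other.Low() < low) low = other.Low();
    if (other.High() > high) high = other.High();
  }
};
```

With this discipline, a call such as a.Cover(b) can change a but never b, so the state of b after the call is known without reading the body of Cover.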

3. Assertions

Class template Assertions provides the functions required for emulating preconditions, postconditions, class invariants, and the “old” notation, for use in the parameter class T. For brevity, we omit support for arbitrary conditions, loop invariants and variants. (That part hasn’t changed much since [3].)

Preconditions and postconditions are simple functions that merely raise an exception when their Boolean argument is false. There is one function for preconditions, named Require, and two for postconditions, named Ensure and Satisfy. Ensure is used with modifiers and expresses a condition on the object that has just been modified, possibly relating it to its original value. Satisfy is used with selectors, and expresses a condition on the result of the function. (We will see different uses of Satisfy later.) This is different from what happens in Eiffel, where the distinction is not made. The three functions have a second argument, of type std::string, which will be appended to the message generated by the exception. We acknowledge that the distinction between Ensure and Satisfy is somewhat of a burden, but not an unreasonable one, because it stresses the basic distinction between modifiers and selectors.

All assertion functions are made available to the classes we want to specify or check by inheritance. Therefore, calls to Require, Ensure or Satisfy are made on behalf of the current instance. Note that these are virtual functions, not static functions as in some assertion systems for Java [15, 14].

Class template Assertions also declares a default Invariant function, returning true. The idea is that the derived classes will redefine the invariant according to their requirements. The invariant needs to be checked only after construction and after a modifier is called. Thus, the evaluation of the invariant should be made by function Ensure, but it is not necessary for function Satisfy.
Functions Require, Ensure, Satisfy and Invariant are const functions, which guarantees that they do not cause side effects on the object. Furthermore, the first three have Boolean arguments, which when computed for class objects cannot produce side effects either. Note that the expression of the Boolean argument cannot syntactically include calls to modifiers, and only modifiers can change the state of an object of a class type. Thus, one is unlikely to cause a side effect with class Assertions by mistake. (Of course, this is C++, and you can always deliberately cause side effects if you really want to.)

For supporting the “old” notation we introduce two data members in class Assertions: oldAttributes, of type std::map<std::string, int>, for storing the old values of various attributes, and oldObjects, of type std::map<std::string, Assertions<T>*>, for storing clones of the current instance [11]. Each attribute is identified by a static tag, and, in the absence of recursion, it is enough not to repeat the tags in the class. The same applies to objects. When we want to observe the initial value of an attribute in the postcondition of a modifier function, we must explicitly call function Observe, with an identifying tag, in order to store the initial value of that attribute, before it is modified. Later, we can fetch it using a function Attribute, using the tag as the key. A similar technique is used for storing the initial value of the target object, and then fetching it, but the functions are now called Remember and Old. We considered overloading the function names in order to simplify the usage, but in the end we decided not to, for the sake of clarity. In fact, functions Observe and Attribute are redundant, because if we remember the object with function Remember, we can have access to the initial values of all attributes. On the other hand, in most cases, modifiers only change the value of one attribute, and it would be excessive to store the full object.

In order to be able to switch the preconditions or the postconditions on or off for all objects of the class, we introduce two static Boolean data members, PreconditionsEnabled and PostconditionsEnabled, and functions to set and reset these variables.

Finally, class template Assertions provides its own exception type as an internal class. Here it is, programmed using the recommended style, i.e., with all functions defined inside the class declaration:

    template <class T>
    class Assertions: public Clonable {
    public:
      // std::exception has no standard constructor taking a message,
      // so the message is carried by std::runtime_error (<stdexcept>).
      class Exception: public std::runtime_error {
      public:
        Exception(const std::string& label):
          std::runtime_error("Assertion violation: " + label + ".") {}
      };
    private:
      static bool PreconditionsEnabled;
      static bool PostconditionsEnabled;
      std::map<std::string, int> oldAttributes;          // old attribute values, by tag
      std::map<std::string, Assertions<T>*> oldObjects;  // clones of the target, by tag
    public:
      // Stores the current value of an attribute under the given tag.
      virtual void Observe(const std::string& tag, int x)
      { oldAttributes[tag] = x; }
      // Stores a clone of the target object, replacing any previous clone.
      virtual void Remember(const std::string& tag)
      {
        delete oldObjects[tag];
        oldObjects[tag] = dynamic_cast<Assertions<T>*>(Clone());
      }
      // Fetches the value stored by Observe.
      virtual int Attribute(const std::string& tag)
      { return oldAttributes[tag]; }
      // Fetches the clone stored by Remember.
      virtual const T& Old(const std::string& tag)
      { return dynamic_cast<const T&>(*oldObjects[tag]); }
      virtual bool Invariant() const
      { return true; }
      virtual void Require(bool b, const std::string& label) const
      {
        if (PreconditionsEnabled && !b)
          throw Exception("Require " + label);
      }
      // Postcondition for modifiers: also checks the invariant.
      virtual void Ensure(bool b, const std::string& label) const
      {
        if (PostconditionsEnabled && !Invariant())
          throw Exception("Invariant " + label);
        if (PostconditionsEnabled && !b)
          throw Exception("Ensure " + label);
      }
      // Postcondition for selectors: the invariant is not checked.
      virtual void Satisfy(bool b, const std::string& label) const
      {
        if (PostconditionsEnabled && !b)
          throw Exception("Satisfy " + label);
      }
    };
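To make the intended usage concrete, here is a minimal sketch of a client class. Several things in it are assumptions and not part of the paper: the class Counter, its member names, the helper IncrementOnFullThrows, the tag string "Increment.count", and the shape given to Clonable. The Assertions template is condensed to the functions the sketch exercises, with the enabling flags and the Remember/Old pair left out, and the exception derives from std::runtime_error for portability.

```cpp
#include <cassert>
#include <map>
#include <stdexcept>
#include <string>

// Assumed shape of the Clonable interface named in the listing.
class Clonable {
public:
  virtual ~Clonable() {}
  virtual Clonable* Clone() const = 0;
};

// Condensed version of the listing: assertions are always enabled here.
template <class T>
class Assertions: public Clonable {
public:
  class Exception: public std::runtime_error {
  public:
    Exception(const std::string& label):
      std::runtime_error("Assertion violation: " + label + ".") {}
  };
private:
  std::map<std::string, int> oldAttributes;
public:
  void Observe(const std::string& tag, int x) { oldAttributes[tag] = x; }
  int Attribute(const std::string& tag) { return oldAttributes[tag]; }
  virtual bool Invariant() const { return true; }
  void Require(bool b, const std::string& label) const
  { if (!b) throw Exception("Require " + label); }
  void Ensure(bool b, const std::string& label) const
  {
    if (!Invariant()) throw Exception("Invariant " + label);
    if (!b) throw Exception("Ensure " + label);
  }
  void Satisfy(bool b, const std::string& label) const
  { if (!b) throw Exception("Satisfy " + label); }
};

// Hypothetical client class: a bounded counter specified with contracts.
class Counter: public Assertions<Counter> {
private:
  int count;
  int capacity;
public:
  explicit Counter(int capacity): count(0), capacity(capacity)
  { Ensure(Count() == 0, "Counter: starts at zero"); }

  virtual Clonable* Clone() const { return new Counter(*this); }

  // Selectors.
  int Count() const { return count; }
  int Capacity() const { return capacity; }
  bool Full() const { return count == capacity; }
  int Remaining() const
  {
    int result = capacity - count;
    Satisfy(result >= 0, "Remaining: result is non-negative");
    return result;
  }

  virtual bool Invariant() const
  { return 0 <= Count() && Count() <= Capacity(); }

  // Modifier: precondition, "old" value, change, postcondition.
  void Increment()
  {
    Require(!Full(), "Increment: counter is not full");
    Observe("Increment.count", Count());   // store the old value
    ++count;
    Ensure(Count() == Attribute("Increment.count") + 1,
           "Increment: count grew by one");
  }
};

// Returns true if incrementing a full counter raises the exception.
bool IncrementOnFullThrows()
{
  Counter c(1);
  c.Increment();
  try { c.Increment(); }
  catch (const Assertions<Counter>::Exception&) { return true; }
  return false;
}
```

The modifier follows the order the text describes: Require first, then Observe before the attribute changes, then Ensure, which checks the invariant as well as the postcondition. The selector Remaining uses Satisfy, so the invariant is not evaluated there.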

For brevity, we omit the static functions that handle the static members. Class template Assertions derives from an abstract class Clonable, which only provides the interface for the Clone function, a pure virtual function that has to be defined in each non-abstract derived class.

Note that the invariant is systematically checked by function Ensure but not by function Satisfy or by function Require. In the general case, invariants must also be checked in preconditions. However, our convention that all class arguments are passed by const reference avoids the indirect invariant effect [10], through which an operation on an object may invalidate an invariant in another. In our case, the only way to change the value of an object is by calling a modifier on that object. Therefore, invariants need to be checked only upon exit from modifiers and constructors.

This does not solve the indirect invariant effect: it merely postpones it. Still, a lot of design by contract can be done within our current framework. When the time comes to allow dynamic aliasing through references in constructors or non-const pointer arguments in functions, we will have to be more careful in specifying postconditions in functions that may modify objects other than their target, and also in functions that modify their target only, but for which the target is involved in invariants of other objects.

For example, consider the “round trip” situation described in [10], pages 403-406, in which a class A has a forward link to class B, which has a backward link to class A. The invariant states that if an object x of class A is forwardly linked to an object y of class B, then y is backwardly linked to x. The problem arises when y decides to clear its link to x, or replace it with another. This operation destroys the invariant of x, although x is not involved in it. In this case, we would have to be more careful in specifying the operation: the postcondition should express that the object originally linked to y, if there was one, is now linked to nothing. Quite clearly, the responsibility of maintaining the invariant of x lies now not only on x but also on y. An alternative design would be making the operation available only to class A. Then, when y wants to clear its link to x, it will have to ask x to do so, and in this case x alone will bear the responsibility of maintaining the invariant.

This situation of a round trip invariant, in which the invariant for a class A involves objects of class B that may be modified by operations outside class A, occurs typically when class B is a container of objects of class A, and each A object must know its container (or its containers, in case there can be more than one). Class B has modifiers Put(A&) and Prune(A&), and selectors IsFull() and Has(const A&). Class A has modifiers Enter(B&) and Leave(B&), and selector IsIn(const B&). Note that functions Put, Prune, Enter and Leave modify their arguments, which goes against the “don’t modify arguments” rule. However, writing y.Put(x) is equivalent to writing x.Enter(y), and y.Prune(x) is equivalent to x.Leave(y). Which pair of functions should we keep? For this purpose, we relax the rule to “don’t modify arguments, unless they are containers”. Thus, we keep Enter and Leave in class A and drop Put and Prune from the public interface of class B. However, Put and Prune are still necessary to actually insert and delete the element in the container. But now they become private functions with const reference arguments, Put(const A&) and Prune(const A&), available to class A. (In C++, we can use friend functions for this.) As a matter of fact, it seems that these functions should not be available in class B at all, although they are defined there, but this kind of information hiding is not supported by the popular object-oriented programming languages.
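The container design just described can be sketched as follows. This is an illustration under stated assumptions, not the paper's code: Club plays the role of class B and Member the role of class A (both names are hypothetical), each member belongs to at most one container, the container stores addresses in a std::set, and plain assert stands in for Require and Ensure. The function names Put, Prune, IsFull, Has, Enter, Leave and IsIn are the ones used in the text.

```cpp
#include <cassert>
#include <cstddef>
#include <set>

class Member;

// Plays the role of class B: a container of Member objects.
class Club {
  friend class Member;   // only Member may call Put and Prune
private:
  std::set<const Member*> members;
  std::size_t capacity;
  // Private modifiers with const reference arguments, as in the text.
  void Put(const Member& m) { members.insert(&m); }
  void Prune(const Member& m) { members.erase(&m); }
public:
  explicit Club(std::size_t capacity): capacity(capacity) {}
  bool IsFull() const { return members.size() >= capacity; }
  bool Has(const Member& m) const { return members.count(&m) > 0; }
};

// Plays the role of class A: each object knows its container.
class Member {
private:
  const Club* club;   // null when the member is in no club
public:
  Member(): club(nullptr) {}

  bool IsIn(const Club& c) const { return club == &c; }

  // Round trip invariant: if this member points to a club,
  // then that club contains this member.
  bool Invariant() const { return club == nullptr || club->Has(*this); }

  // Relaxed rule: the container argument may be modified.
  void Enter(Club& c)
  {
    assert(club == nullptr && !c.IsFull());            // precondition
    c.Put(*this);
    club = &c;
    assert(IsIn(c) && c.Has(*this) && Invariant());    // postcondition
  }

  void Leave(Club& c)
  {
    assert(IsIn(c));                                   // precondition
    c.Prune(*this);
    club = nullptr;
    assert(!IsIn(c) && !c.Has(*this) && Invariant());  // postcondition
  }
};
```

Because Put and Prune are private and reachable only through the friend declaration, client code cannot break the round trip invariant by updating the container without updating the member; both links always change together inside Enter and Leave.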


4. Class StringT

As an example, let us develop a class StringT to represent simple strings, with modifiers to add a character at the end (Put) or at a given position (PutAt), to select a substring (Select), to erase a substring (Erase), and selectors for the capacity (Capacity), the length (Count), the character at a given position (At), for checking if it is empty (Empty), and for checking if it is full (Full). We start by specifying the class using preconditions, postconditions and invariants. In order to use the resources of class Assertions, class StringT inherits from Assertions<StringT>:

    class StringT: public Assertions<StringT> {
      // …
    };

This is an unusual pattern: we could almost say that class StringT inherits from itself. The invariant for this class expresses relations that hold among various selectors: virtual bool Invariant() const { return Capacity() >= 0 && 0