Combining UML and Z in a Software Process
Ebba Thora Hvannberg
University of Iceland, Reykjavik, Iceland
[email protected]
Abstract: The work presented in this paper was motivated by observing students writing specifications in UML and Z. There is a need to add formal specification to diagrammatic notations such as UML, and an even greater need to examine where in the software life cycle Z specifications are most useful. The Z language allows us to use abstraction in software development, whereas UML has rich constructs for modelling systems with concise diagrams. Languages are only islands and need to be accompanied by methods that show how they are used in a process. The Unified Process has been developed for use with UML. This paper examines how UML can be augmented with Z and where in the Unified Process Z specifications fit best.
Keywords: software process, UML, Z, formal specification, diagrammatic notation
1 Introduction
When giving a course on formal methods in software development I have twice tried to compare the benefits and drawbacks of UML (Unified Modeling Language) [1] and the Z formal specification language [2]. UML describes static models in class diagrams that include the attributes of classes, their methods and the relationships between classes. It also offers a set of diagrams for the behaviour of software systems as well as for components and packages. The Z language is based on propositional and predicate calculus. It can model problems or systems as sets, relations and functions. Schemas express states with data invariants, or they express transformations with pre- and postconditions. These constraints denote what must be true before the transformation of the state takes place and what is true after the transformation has been carried out.
The comparison was carried out by giving separate assignments in UML and Z. The first time I wanted to contrast the two methods, I asked students to solve a problem with one methodology and then again with the other. Students were divided into two groups, A and B. Students in group A started by solving the problem with UML and followed with Z; students in group B started with Z and followed with UML. Students in group B did not get to the second part because so much time was spent on writing the Z specifications. Students in group A found it time consuming to do the assignment again in Z once they had already done it in UML. They could not directly reuse much of the specification that they had already written.
The second time the course was offered, I used another arrangement. There were several different problems to solve, and each student was to do one exercise in UML and another in Z. The problems were real, some suggested by the students themselves. The intention was to compare the number and types of errors the students made in the assignments. This proved difficult because it was hard to determine in a consistent manner what constituted an error, so a comparison of errors in UML and Z would not have been meaningful. Instead, the characteristics of each of the methodologies and the approaches the students used were observed. A few of the observations are mentioned below.
The learning curve for formal specification is steeper than for diagrammatic approaches. Students who do not have much experience in modelling software have problems with both UML and Z. Many of the students work in industry, and they objected to modelling the problem in Z. The objection was not always rooted in the Z notation itself but in modelling in general. Experienced developers can use either UML or Z, but Z will probably take longer. The reason is that there are more rules to follow in the Z language and that the semantic richness of the language forces you to specify and discover more of the problem. The tools for UML are more accessible than those for Z. Students find it more permissible to break the rules when using UML. It seems that since the language is less formal than Z they tend to think it is all right to break the rules of the grammar.
Many of them choose to use a regular drawing tool for UML so that they can be more relaxed about the notation, and prefer UML-specific tools only if they intend to generate an implementation automatically from the design. Students seemed to have difficulties in forming states or schemas in Z since they did not have any methodology to determine what constitutes a schema. In object-oriented modelling they have some rules to follow, such as making a list of candidate classes from nouns.
Having used and taught both UML and Z has motivated us to combine the two languages in a software process lifecycle. The first approach to the problem was to try to see where in the life cycle developers should use Z and where they should use UML. We have since come to realise that there must be a stronger connection between the two languages. Specifications or declarations in one language should be reusable in the other. The next section specifies the requirements for the combination of UML and Z and the third section describes the selected approach. The strength of Z and the weakness of UML are described in sections four and five respectively. Section 6 discusses Z augmentations for each of the UML constructs. The seventh section describes processes for the combined languages.
2 Requirements
A specification and design method calls for several characteristics. The foremost characteristic is abstraction, that is, the ability to describe something in the real world without specifying details. An example would be an invoice. We need to be able to say that there are objects of type invoice without saying what characteristics or behaviour they have. In later stages, we need to be able to refine the abstraction in small manageable steps. A specification needs to be verifiable and traceable. We need to verify that a design meets a specification that fulfils the requirements, and we need to do so in a traceable manner. The verification can only be done effectively if the specification is unambiguous. Another reason for specifying software formally is that the specification becomes shorter, so the reader is quicker to see the definitions and has to spend less time asking the same questions. The designer needs to feel that something is gained by specifying software formally, and is therefore happy if the design maps easily, and preferably automatically, to an implementation. Developers should be able to master the method quickly and be able to write specifications and designs. Many think that at least buyers, and even users, should be able to read a specification easily. Finally, the method needs to have a formal meta-model so that one can specify its semantics.
3 A selected approach
During software specification and design, it is an advantage to follow a process that guides us through the methodology. We can have languages such as UML and Z, but they are of little use unless we have a process. The reason we used Rumbaugh’s book [3] as a reference in a course on software design for some time after we had stopped using OMT (Object Modelling Technique) and moved to UML is that it contained best practices and a process for applying OMT. One of the process frameworks used for UML is the Unified Software Development Process (UP) [4]. The Cleanroom method also has a process framework, as described in [5]. The lack of a process has undoubtedly been a weakness for Z, since textbooks often cover the language with examples and case studies but not a process that dictates where in the software development life cycle it should be used and how it is connected to other artefacts or subprocesses. Most books do, however, describe how to go from abstract specifications to detailed refinements.
OMT, the predecessor of UML, assumed that we went directly into analysis with diagrams and did not write any text. Functions were taboo since the emphasis was on objects, their attributes and actions. Later, with UML, it became permissible to write use cases, which are a kind of task or function. Use cases serve the valuable purpose of describing the dialogue between the system and the user. Use cases can be refined into application use cases, where artefacts and actions in the dialogue have been refined into GUI objects and actions. Although it is not a part of UML, there is a convention of describing use cases in text with pre- and postconditions. Z can easily play a larger role early in the software life cycle by defining predicates for use cases, but they have to be very abstract. Furthermore, it is important to be able to use schemas as classes in the UML notation. Traceability between the Z notation and UML is also imperative.
OMG has specified a constraint language, the Object Constraint Language (OCL) [6], that allows you to specify pre- and postconditions as well as invariants. These are specified in brackets and can be placed anywhere in the diagrams. The language is disputed [7] and researchers are identifying its shortcomings [8].
One of the objectives of OCL is to allow the specification of constraints, but in a less formal language than Z.
In the past, when hurdles have been encountered in software development, various approaches have been taken. The most popular is to build a tool. Unfortunately, this may be expensive if the user base is not large. A second approach is to have a library of functions or components that we can search for solutions. A third approach is to have patterns or frameworks that we can populate. Patterns for design solutions have been presented in UML, and Z bases its language on a library of functions. Tools that implement Z have unfortunately not been very successful. Patterns of software life cycle processes, in which developers can see where they need to augment UML with Z, may be helpful. There could be different patterns of software life cycle processes for different types of software. Strict safety and security requirements mandate a more formal specification.
4 Strength of Z
Behavioural specification, i.e. describing how things change, is the strong side of Z but at the same time the weak side of UML. In UML, we can express state changes, collaboration and workflow, but we cannot describe how objects change in terms of transformations. The strong side of UML is simplicity, and the weak side of Z is its lack of a graphical notation. The key to success in software development is to postpone refinement as long as possible. If we are to model a file directory, we can simply write it first as a set of FILEs. We can postpone the definition of a function that identifies a file by its name. When teaching students Z specification, the author has found it most useful to ask them to write the specification in natural language first and then to introduce simple logical operators, followed by quantifiers and variables for predicates. Z is strong on sets and functions. It is, for example, very easy to specify sequences and bags, but in UML you cannot make this distinction when you specify relationships.
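The postponed refinement of the file directory described above can be illustrated with a small executable sketch. It is written in Python rather than Z, and the names File, Directory and NamedDirectory are hypothetical and chosen only for this illustration. The first class models the directory as a plain set of FILEs; the second refines it with a partial function from names to files, and its files method plays the role of the abstraction relation back to the set.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class File:
    """An opaque FILE element; nothing about its contents is specified yet."""
    ident: int  # hypothetical identity, only so that elements can be told apart


@dataclass
class Directory:
    """Abstract model: a directory is just a set of FILEs."""
    files: set = field(default_factory=set)

    def add(self, f: File) -> None:
        # Precondition: the file is not yet in the directory.
        assert f not in self.files, "precondition violated: file already present"
        before = set(self.files)
        self.files.add(f)
        # Postcondition: the new set is the old set united with {f}.
        assert self.files == before | {f}


@dataclass
class NamedDirectory:
    """Refinement: a partial function from names to FILEs."""
    by_name: dict = field(default_factory=dict)

    def add(self, name: str, f: File) -> None:
        # Precondition: the name is not yet in the domain of the function.
        assert name not in self.by_name, "precondition violated: name in use"
        self.by_name[name] = f

    def files(self) -> set:
        # Abstraction relation: the range of the function is the abstract set.
        return set(self.by_name.values())
```

The point of the sketch is only that the first model can be written, read and checked before anything is decided about names; the naming function is introduced in a later, separate step.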
5 Weakness of UML
UML’s weakness is that its specifications lack richness. First, UML does not allow us to specify behaviour well in terms of how activities transform one state into another. For example, when adding VAT to a total price it is not easy to specify that the total price is the base price with VAT added, and that the VAT rate can be either 14.5% or 24.5% depending on the type of goods. Such transformation rules and other constraints often come up in the analysis phase in discussions with the user or the customer. Second, one cannot specify that one attribute or object is derived from another, that is, that its values are taken from another attribute. An example of this is the creation of an invoice from an order of goods, where we need to copy the product items from the order to the invoice. In UML, we can specify that an invoice relates to a product and that an order relates to a product, but nothing indicates that the product is the same one in both cases.
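The two rules above, the VAT rule and the derivation of an invoice from an order, are the kind of constraint that a Z postcondition would state. The following Python sketch expresses them as executable assertions. It rests on assumptions that are not in the text: the total is read as the base price times one plus the rate, the category names "reduced" and "standard" are hypothetical labels for the two rates, and all class and function names are invented for the illustration.

```python
from dataclasses import dataclass
from decimal import Decimal

# The two rates named in the text; the category names are hypothetical.
VAT_RATES = {"reduced": Decimal("0.145"), "standard": Decimal("0.245")}


@dataclass(frozen=True)
class Item:
    product: str
    base_price: Decimal
    goods_type: str  # data invariant: must be a key of VAT_RATES


def total_price(item: Item) -> Decimal:
    # Precondition: the type of goods determines exactly one VAT rate.
    assert item.goods_type in VAT_RATES
    rate = VAT_RATES[item.goods_type]
    total = item.base_price * (1 + rate)
    # Postcondition: total = base price + VAT on the base price.
    assert total == item.base_price + item.base_price * rate
    return total


@dataclass(frozen=True)
class Order:
    items: tuple


@dataclass(frozen=True)
class Invoice:
    items: tuple
    amount: Decimal


def create_invoice(order: Order) -> Invoice:
    amount = sum((total_price(i) for i in order.items), Decimal("0"))
    invoice = Invoice(items=order.items, amount=amount)
    # Postcondition: the invoice carries the very same items as the order.
    assert invoice.items == order.items
    return invoice
```

The last assertion states what the class diagram cannot: that the items on the invoice are the ones on the order, not merely items of the same class.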
6 UML augmented with Z
UML is used throughout the software life cycle. All the UML constructs can be used in requirements, analysis, design and testing. Furthermore, some tools derive an implementation from class diagrams. Requirements specification involves specifying the business model, whereas analysis is a more precise specification. In this section, we describe how UML can be annotated with Z. We avoid discussing software life cycle processes in this section, except to indicate how useful it is to iterate over the UML and Z specifications within the analysis process. The following section explains the different emphasis on UML or Z in the various life cycle processes.
6.1 Use cases
Use case texts are not a part of UML, but the templates that describe such texts typically include a name, an identifier, a pre- and a postcondition, a description and exceptions. In use case texts we propose to use Z for pre- and postconditions to capture changes of state and data invariants. In order to do this we may identify some sets and schemas that capture states. Propositions and predicates are used to express the pre- and postconditions of use cases as well as their descriptions. Use case diagrams give a good overview of the use cases of a system. They also allow us to specify two types of relationships between use cases, namely extends and uses. The use case diagrams can be annotated with Z pre- and postconditions.
6.2 Static modelling
In the first iteration of class diagrams, we identify classes with attributes. No associations are allowed. We can write Z specifications of relations and function signatures. Z schemas that capture states will also start to appear during the first iteration. Classes in UML are often Z schemas and vice versa. In the second iteration of class diagrams, UML associations can be derived semi-automatically from Z relations and functions. In the third iteration, methods are added to the class diagrams as they materialise in the behavioural diagrams. Note that the third iteration is most often carried out after behavioural modelling has taken place, as discussed in the next subsection. If the Z pre- or postconditions have not been specified during the drawing of the diagrams, they are written in this iteration.
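One way to picture the semi-automatic derivation in the second iteration is a lookup from the kind of Z declaration to the multiplicities of a UML association. The sketch below is an assumption about how such a derivation could work, not a description of an existing tool. The names ZDeclaration, Association and derive_association are hypothetical, and the table follows only from the mathematical meaning of relations, functions and injections.

```python
from dataclasses import dataclass

# Multiplicities at the (source end, target end) of the association
# for each kind of Z declaration between two sets.
MULTIPLICITY = {
    "relation": ("*", "*"),
    "partial function": ("*", "0..1"),
    "total function": ("*", "1"),
    "partial injection": ("0..1", "0..1"),
    "total injection": ("0..1", "1"),
    "bijection": ("1", "1"),
}


@dataclass(frozen=True)
class ZDeclaration:
    name: str
    source: str  # name of the source set or schema
    target: str  # name of the target set or schema
    kind: str    # one of the keys of MULTIPLICITY


@dataclass(frozen=True)
class Association:
    name: str
    source: str
    source_multiplicity: str
    target: str
    target_multiplicity: str


def derive_association(decl: ZDeclaration) -> Association:
    """Suggest a UML association for a Z relation or function declaration."""
    source_mult, target_mult = MULTIPLICITY[decl.kind]
    return Association(decl.name, decl.source, source_mult,
                       decl.target, target_mult)
```

For instance, a total function from Order to Customer would suggest an association with many orders at one end and exactly one customer at the other. The derivation is only semi-automatic because the developer still has to decide which declarations deserve an association at all.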
6.3 Behavioural modelling
In UML there are two main categories of behavioural diagrams: state diagrams and interaction diagrams. State diagrams can be either conventional state diagrams or activity diagrams. Interaction diagrams can be either sequence diagrams or collaboration diagrams. Normally, we decide the actions of the classes after drawing the behavioural diagrams. One approach is to specify the actions with Z specifications; some functions will already have been identified in the use cases. Below, we discuss how behavioural diagrams can be augmented with Z specifications. An essential prerequisite is that we have specified the Z types (or schemas) on which to work during the first and second iterations of writing class diagrams.
An activity diagram allows you to specify that an object is updated, by drawing an arrow from a state to an object, but not how it is updated. The update will be specified with Z. State diagrams allow us to specify which actions transform one state into another. We can name the states but we cannot specify what changes. For example, when an order is filled it goes from being placed to being filled, but we cannot specify the side effects, such as that the goods should be withdrawn from supplies.
Sequence diagrams are used to specify the interaction between the user and the major objects of the system as a series of steps, where one starts at the top and proceeds down the page as time passes. The major actions could be annotated with Z specifications of postconditions. Sequence diagrams are often used to show the user a series of actions during analysis. Collaboration diagrams are used to denote how the objects are glued together. They show a series of numbered action calls between objects. Although collaboration diagrams can be used during analysis, they are perhaps better suited to design. Our hypothesis is that it may not be suitable to annotate the actions with Z specifications, but rather to specify them separately with pre- and postconditions.
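The order example from the discussion of state diagrams can be sketched as a transition with explicit pre- and postconditions. The sketch is in Python rather than Z, and the names OrderState, Order and fill_order, as well as the representation of the supplies as a dictionary from product to quantity, are assumptions made for the illustration. The postcondition states the side effect that the state diagram alone cannot express.

```python
from dataclasses import dataclass
from enum import Enum


class OrderState(Enum):
    PLACED = "placed"
    FILLED = "filled"


@dataclass
class Order:
    product: str
    quantity: int
    state: OrderState = OrderState.PLACED


def fill_order(order: Order, stock: dict) -> None:
    """Transition placed -> filled, including the side effect on the supplies."""
    # Precondition: the order is placed and the supplies cover the quantity.
    assert order.state is OrderState.PLACED
    assert stock.get(order.product, 0) >= order.quantity
    stock_before = dict(stock)

    stock[order.product] -= order.quantity
    order.state = OrderState.FILLED

    # Postcondition: the order is filled, the goods have been withdrawn,
    # and the supplies of all other products are unchanged.
    assert order.state is OrderState.FILLED
    assert stock[order.product] == stock_before[order.product] - order.quantity
    assert all(stock[p] == n for p, n in stock_before.items()
               if p != order.product)
```

The last assertion corresponds to what a Z schema would state explicitly about the parts of the state that the operation leaves untouched.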
The key to the integration of UML and Z as described in this section is that we specify the sets, relations and functions along with the class diagrams. Furthermore, we need to automate the derivation of one form from the other as much as possible. As mentioned previously, it is vital that the two methods are integrated. However, this can be difficult since their presentation forms are very different: one is a drawing and the other is a mathematical specification. We know from experience that the way a method is applied can promote or hinder its usage. Beginners usually prefer a graphical user interface and like to draw, but specialists like a more succinct presentation of commands. This is surely one of the reasons why formal methods have few followers and why it is so difficult for people to get started. With XML (www.w3c.org), we may have a technology that allows us to represent both forms in one document and to link them together more easily. We hope that XML will also give us better traceability between items of the two languages. UML has a well-defined meta-model, and thus a UML definition can be translated into an XML document. Undoubtedly, we will be able to specify Z in an XML document in the future.
7 Modified processes for UML with Z
The characteristics of the Unified Process are that it is use case driven, architecture centric, iterative and risk driven. That it is iterative means that artefacts are delivered in increments. That it is risk driven means that at every step of the way we try to identify the risks and control them. The question before us is how adding formal specification to UML supports these characteristics, or whether the addition motivates any changes to the process. In the following three subsections we discuss how Z supports the claimed characteristics of the Unified Process. In the fourth subsection, we examine the different models of the Unified Process in an attempt to answer the above question.
7.1 Iterative and incremental
In iterative software development the project is broken down into parts, so that increments are developed and then integrated with the rest. Adding Z does not add anything further in support of this characteristic when iterations constitute increments in functionality. However, when we talk about increments in refinement, Z will help us. Z encourages the developer to specify software abstractly. He or she may be able to develop simpler versions first and then more refined versions with other characteristics, such as better performance. This is also a characteristic of the Cleanroom process methodology.
7.2 Risk driven
How Z supports the risk driven characteristic of the Unified Process is not clear. First, one can argue that it reduces risk since the specification is less ambiguous. When requirements elicitation sessions are observed, one sees that it can be time consuming when users or buyers ask the same questions repeatedly. A written specification remedies this problem partially, but an informal text is not enough since developers tend to leave gaps in the specifications. Second, a formal specification may decrease risk since more time is spent on specification and thus the developer is more likely to discover attributes and functions of the system. One can also argue to the contrary that formal specification can increase risk, since there may not be enough knowledge or training to apply it.
7.3 Architecture driven
The Unified Process encourages developers to consider architecture early in the software life cycle. UML supports the specification of architecture through deployment diagrams and collaboration diagrams, in which collaboration between subsystems is shown. Z specifications can be added by placing transformations on the edges that denote collaborations between components or subsystems. Architecture can further benefit from Z specifications that state constraints on non-functional requirements such as performance, security or usability. Further support of Z for the architecture description has not been studied.
7.4 The Unified Process
The second part of our exercise is to look at the various models of the Unified Process and describe how and where an embedded Z specification will change them. In order to have a complete analysis, one can look at a matrix in which the models make up the rows and the phases the columns. The models are the use case, analysis, design, deployment, implementation and testing models. The phases are inception, elaboration, construction and transition. As described in the Unified Process, the amount of effort spent on each model varies between the phases. For example, the emphasis is mostly on the use case model during inception, on the analysis model during elaboration, on the design and deployment models in late inception and early elaboration, and on implementation during construction. The testing model is used in all the phases; it verifies that the transition between models is correct and tests that the functional and non-functional requirements are met.
Use cases that are written during inception will benefit from having postconditions written in Z, as long as the major sets have been identified. This is the most drastic change, since use cases have often not been specified very formally. The use case model constitutes the external view of the system, i.e. the contract between the buyer and the seller. Thus, it can be valuable to make it less ambiguous with formal specification.
Formal specification is traditionally most heavily used during elaboration, where we perform analysis and start to design. Analysis can be described in the language of the developer, and thus we have more freedom to use a formal specification language. Since a design is a refinement of analysis, it is only natural that we include formal specification there too. However, experience has shown that as we get nearer to design and the technical environment of the system, there may be less need for formal specification because of the use of functions from libraries, well-known data structures and components. This is particularly true where we use ready-made components. In other cases, developers require that Z specifications can be written along with the programs as comments. During design, there is more emphasis on breaking down activities into methods with pre- and postconditions. Furthermore, we have more behavioural diagrams, such as state and interaction diagrams. If the design is critical, as in safety or security systems, or in commerce systems that require reliability because of possible monetary loss, we will be forced to specify the design formally.
The implementation model is the most concrete and the most formal one. Traceability to implementation is vital in order to be able to carry out testing. We will not assess whether formal specification should be a part of the test model; rather, the formal specification should be input to the testing model. The Cleanroom method has a good model for verification, since it allows you to specify any statement in a function as a set of states with predicates that capture the semantics of that state.
8 Conclusion
This paper has presented an approach to augmenting UML with Z, with emphasis on describing its use in the software life cycle. Students’ experience with using each of the methods has motivated the work. In our curriculum, we think it is important to teach students formal specification of software. It has proven somewhat difficult to do so in isolation from other methodologies that we teach, such as UML. As is pointed out in [9], it is important that software developers receive mathematical training as a part of their formal education, since they are not likely to perceive it as an important skill to add later on during continuing education. Further work will demonstrate the ideas set forth here in examples and project assignments. When more experience has been gained, we will evaluate whether the requirements from section two have been met. At the outset, we stated that abstraction and a well-defined software process are keys to successful development. Experience shows that they also help students with specification. In [9] Bertrand Meyer sets forth an important body of software engineering knowledge. The list of principles contains items that we have studied in this paper, such as abstraction, typing and reuse, but it also includes items that we have left out. Further work should address how items such as designing for change, exception handling and debugging are crystallised in UML augmented with Z.
9 References
[1] Pooley, Stevens, Using UML, Addison Wesley, 1999
[2] Woodcock, Davies, Using Z, Prentice Hall, 1996
[3] J. Rumbaugh, M. Blaha, W. Premerlani, F. Eddy, W. Lorensen, Object-oriented modeling and design, Prentice Hall, 1991
[4] I. Jacobson, G. Booch, J. Rumbaugh, The Unified Software Development Process, Addison Wesley, 1998
[5] C.J. Trammell, R.C. Linger, J.H. Poore, Cleanroom Software Engineering: Technology and Process, SEI Series in Software Engineering, 1998
[6] J. Warmer, A. Kleppe, The Object Constraint Language, Precise Modeling with UML, Addison Wesley, 1999
[7] M. Vaziri, D. Jackson, Some Shortcomings of OCL, the Object Constraint Language of UML, Response to Object Management Group's Request for Information on UML 2.0, December 1999
[8] J. Warmer and T. Clark, editors, UML 2.0 The future of OCL, Report of the workshop held on October 2, 2000, York
[9] B. Meyer, Software Engineering in the Academy, IEEE Computer, May 2001