Developing Software Metrics Applicable to UML Models
Hyoseob Kim (1) and Cornelia Boldyreff (2)
(1) Centre for HCI Design, City University, Northampton Square, London, EC1V 0HB, UK
[email protected]
(2) Department of Computer Science, University of Durham, South Road, Durham, DH1 3LE, UK
[email protected]
Abstract. This paper proposes some new software metrics that can be applied to UML modelling elements such as classes and messages. These metrics can be used to predict various characteristics at the earlier stages of the software life cycle. A CASE tool has been developed on top of Rational Rose (footnote 1) using its BasicScript language, and we provide some examples of its use.
1 Introduction
Software metrics can be used to determine the properties of the software under development and to predict the effort and development time required. Many different kinds of metrics have been developed over the past few decades to match different programming paradigms such as structured programming and object-oriented programming (OOP). Among these, “LOC (Lines of Code)” is one of the oldest and most primitive metrics. In the early 1990s, Chidamber and Kemerer proposed six new object-oriented metrics to overcome the limitations of the more traditional code-based metrics [1]. They are “weighted methods per class (WMC)”, “depth of inheritance tree (DIT)”, “number of children (NOC)”, “coupling between object classes (CBO)”, “response for a class (RFC)” and “lack of cohesion in methods (LCOM)”. Together with other similar OO metrics, they have helped users analyse their code to some extent. However, as software engineers’ focus has shifted to the earlier stages of the life cycle, the shortcomings of OO code metrics, like those of their predecessors, have become more apparent. A comprehensive approach to developing and applying metrics to artefacts produced at the early stages of the life cycle, such as designs, is therefore needed.
In the meantime, the Unified Modelling Language (UML) was adopted by the Object Management Group (OMG) in 1997, ending the so-called “OO methods war”, and has since become the de facto standard graphical language for specifying, constructing, visualising, and documenting software systems, as well as for business modelling and other non-software systems [10]. UML has been used intensively by software developers since its introduction. Many organisations use UML as a common language for their project artefacts and have adopted it as their organisational standard. As the number of UML models produced within an organisation has increased, a need to measure their characteristics has arisen.
The investigation described in this paper goes in this direction.
Footnote 1: Rational Rose is a registered trademark of Rational Software Corporation.
The overall aim of this paper is to develop software metrics that can be applied to UML models. Just as UML serves as a standardised modelling language, these metrics are intended to serve as a standardised metrics suite. This paper is organised as follows. Section 2 surveys related work on software metrics development. Section 3 introduces the new metrics, called UML metrics, together with their definitions. Section 4 covers the implementation of these metrics in a CASE tool called UMP (UML Metrics Producer). Finally, Section 5 draws some conclusions and suggests further work.
2 Previous Work
Since Chidamber and Kemerer published their OO metrics suite [1], many researchers have developed new metrics based on their own experience [5], reflecting the growing popularity of OOP in software development. There has also been strong interest in automating the production of these metrics. CCCC (C and C++ Code Counter), JMetric and McCabe’s metrics tool are among the resulting tools. Most of these new metrics and tools, however, deal only with language-dependent source code, which typically becomes available at the later stages of the software life cycle. They fail to address the importance of the software artefacts produced during the earlier stages, such as the requirements and analysis stages. This paper reports an attempt to overcome these limitations by measuring UML models.
3 UML Metrics
3.1 UML Metamodel Architecture
UML is defined using a layered metamodel architecture, as shown in Figure 1. For instance, a class is an instance of the metaclass Class of the UML Metamodel, which in turn is an instance of the metaclass Class of the MOF Meta-Metamodel. The bottom layer shows how an object can be instantiated from a class of the analysis model. This kind of modelling scheme helps define the different modelling elements and the relationships between them more precisely and formally. For example, Figure 2, which elaborates on Figure 1, shows the relationships that can exist between classes, such as generalisation, association and aggregation. The UML metrics proposed here are based on this metamodel scheme. Four categories of metrics are proposed below: model, class, message, and use case metrics. Table 1 lists the 27 metrics, most of them new, that measure various characteristics of UML models.
Fig. 1. The UML metamodel architecture [10]

Abbreviation   UML Metric
CBC            Coupling between classes
DIT            Depth of inheritance tree
NACM           Number of actors in a model
NACU           Number of actors associated with a use case
NAGM           Number of the aggregations in a model
NASC           Number of the associations linked to a class
NASM           Number of the associations in a model
NATC1          Number of the attributes in a class - unweighted
NATC2          Number of the attributes in a class - weighted
NCM            Number of the classes in a model
NDM            Number of the directly dispatched messages of a message
NDM*           Number of the elements in the transitive closure of the directly dispatched messages of a message
NIM            Number of the inheritance relations in a model
NMM            Number of the messages in a model
NMRC           Number of messages received by the instantiated objects of a class
NMSC           Number of messages sent by the instantiated objects of a class
NMU            Number of messages associated with a use case
NOM            Number of the objects in a model
NOPC1          Number of the operations in a class - unweighted
NOPC2          Number of the operations in a class - weighted
NPM            Number of the packages in a model
NSCU           Number of system classes associated with a use case
NSUBC          Number of the subclasses of a class
NSUBC*         Number of the elements in the transitive closure of the subclasses of a class
NSUPC          Number of the superclasses of a class
NSUPC*         Number of the elements in the transitive closure of the superclasses of a class
NUM            Number of the use cases in a model

Table 1. Software metrics for UML models
Fig. 2. The UML metamodel describing the relationships between classes [10]
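As a loose analogy only, the instance-of chain described in this section can be mimicked with Python's own object model. The names UMLClass and Order are hypothetical, and Python's built-in type merely plays a role similar to the MOF layer here; this is not how UML or MOF is actually defined.

```python
class UMLClass(type):
    """Hypothetical stand-in for the metaclass Class of the UML metamodel."""


class Order(metaclass=UMLClass):
    """A class of the analysis model; it is an instance of UMLClass."""

    def __init__(self, number):
        self.number = number


# An object instantiated from a class of the analysis model.
order = Order(42)

# Walk up the layers: object -> class -> metaclass -> meta-metaclass.
layers = [type(order).__name__, type(Order).__name__, type(UMLClass).__name__]
print(layers)
```

Each step of type(...) moves one layer up, in the same way that each layer of Figure 1 is an instance of the layer above it.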
3.2 Metrics for Model
Model metrics estimate the size of a model, or the amount of information it contains. A model is a reflection of a system to be developed at a later stage of the life cycle. These metrics are therefore useful in that developers can predict the characteristics of systems that have not yet been implemented.
1. Number of the packages in a model (NPM): This metric counts the number of packages in a model. A package is a way of grouping closely related modelling elements. Packages also help avoid naming conflicts.
2. Number of the classes in a model (NCM): A class in a model is an instance of the metaclass “Class”. This metric counts the number of classes in a model. It is comparable to the traditional LOC (lines of code) metric, or to the more advanced McCabe cyclomatic complexity (MVG) metric [7], as a means of estimating the size of a system. Thus, in OOP this metric can be used to compare the sizes of systems.
3. Number of actors in a model (NAM; listed as NACM in Table 1): According to the UML specification [10], an actor is a special class whose stereotype is “Actor”. This metric computes the number of actors in a model.
4. Number of the use cases in a model (NUM): The rationale for including this metric is that a use case represents a coherent unit of functionality provided by a system, a subsystem, or a class.
5. Number of the objects in a model (NOM): Just as a class is an instance of the metaclass “Class”, an object is an instance of a class.
6. Number of the messages in a model (NMM): A message is an instance of the metaclass “Message”. Messages are exchanged between objects and manifest their interactions.
7. Number of the associations in a model (NASM): An association is a connection, or a link, between classes. This metric is useful for estimating the scale of the relationships between classes.
8. Number of the aggregations in a model (NAGM): An aggregation is a special form of association that specifies a whole-part relationship between the aggregate (whole) and a component part.
9. Number of the inheritance relations in a model (NIM): This metric counts the number of generalisation relationships between the classes in a model.
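The model metrics above are plain counts, so they can be sketched in a few lines. The following is a minimal illustration, not the UMP implementation: the UmlModel structure, its field names and the example data are hypothetical, and two choices are assumptions rather than definitions from the paper (actors are excluded from NCM, and aggregations are included in NASM).

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class UmlModel:
    """Hypothetical, much simplified in-memory view of a UML model."""
    packages: list = field(default_factory=list)
    classes: dict = field(default_factory=dict)        # class name -> stereotype ("" if none)
    use_cases: list = field(default_factory=list)
    objects: list = field(default_factory=list)
    messages: list = field(default_factory=list)
    relationships: list = field(default_factory=list)  # (kind, source, target)


def model_metrics(model):
    """Return the nine model metrics as a dictionary of counts."""
    kinds = Counter(kind for kind, _source, _target in model.relationships)
    actors = sum(1 for stereotype in model.classes.values() if stereotype == "Actor")
    return {
        "NPM": len(model.packages),
        "NCM": len(model.classes) - actors,  # assumption: actors are not counted as classes
        "NAM": actors,
        "NUM": len(model.use_cases),
        "NOM": len(model.objects),
        "NMM": len(model.messages),
        "NASM": kinds["association"] + kinds["aggregation"],  # assumption: aggregations included
        "NAGM": kinds["aggregation"],
        "NIM": kinds["generalisation"],
    }


# Invented example model, used only to exercise the function.
example = UmlModel(
    packages=["Sales"],
    classes={"Customer": "Actor", "Order": "", "Item": "", "RushOrder": ""},
    use_cases=["Place order"],
    objects=["anOrder", "anItem"],
    messages=["addItem()", "price()", "total()"],
    relationships=[
        ("association", "Customer", "Order"),
        ("aggregation", "Order", "Item"),
        ("generalisation", "RushOrder", "Order"),
    ],
)
print(model_metrics(example))
```

Keeping the relationship kind as a tag on each entry means NASM, NAGM and NIM all come from a single pass over the relationships.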
3.3 Metrics for Class
Class metrics are concerned with various characteristics of a class, such as its attributes, relationships and object instantiation. Figure 3 is used to give specific examples.
Fig. 3. An example of class diagram and sequence diagram
1. Number of the attributes in a class - unweighted (NATC1): This metric counts the number of attributes in a class. No weighting scheme is applied, meaning that public, private and protected attributes are treated equally. For example, the value of the NATC1 metric for Class1 is 3, as it has 3 attributes.
2. Number of the attributes in a class - weighted (NATC2): This metric is a weighted version of NATC1. That is, it applies a different weight to each attribute depending on its visibility, i.e. 1.0 for public, 0.5 for protected and 0.0 for private attributes. This is more accurate in the sense that the concept of encapsulation is better reflected in this weighting scheme. In the example class diagram, Class1 has 3 attributes, a1, a2 and a3, whose access control properties are private, protected, and public respectively. Thus, NATC2(Class1) = 0.0 + 0.5 + 1.0 = 1.5.
3. Number of the operations in a class - unweighted (NOPC1): This is an unweighted metric that counts the number of operations in a class. For example, NOPC1(Class2) = 1.0 + 1.0 = 2.0, because Class2 is a subclass of Class1 and inherits the operation op1 from Class1.
4. Number of the operations in a class - weighted (NOPC2): This metric is the same as NOPC1 except that different weights are applied. Exactly the same weights are used as in NATC2.
5. Number of the associations linked to a class (NASC): The number of associations, including aggregations, is counted. This metric is useful for estimating the static relationships between classes. For example, NASC(Class1) = 2, as it has one association with Class3 and one aggregation with Class4.
6. Coupling between classes (CBC): Also known as message delivery channels, this metric counts the number of associations of a class plus the number of its attributes whose parameters are of a class type [11]. Messages can only be sent when an object of a class holds a reference to another object of a class. In the example, Class1 has 3 attributes. Among them, two have class-type parameters, i.e. a1 and a3. Thus CBC(Class1) = 2 + NASC(Class1) = 4.
7. Depth of inheritance tree (DIT): The definition of this metric is exactly the same as that of the CK (Chidamber and Kemerer) metric. In the example, DIT(Class4) = 2. This metric is useful for measuring the vertical hierarchy of an inheritance tree. The higher the value of DIT, the greater the chance of reuse. However, a high value of DIT can cause program comprehension problems [5].
8. Number of the superclasses of a class (NSUPC): This counts the direct parents of a class. In a single inheritance implementation such as Java, the value of this metric is either 0 or 1, whereas under a multiple inheritance scheme it is greater than or equal to 0. For example, NSUPC(Class4) = |{Class2, Class3}| = 2.
9. Number of the elements in the transitive closure of the superclasses of a class (NSUPC*): This counts the transitive closure of the superclasses of a class, and it is potentially useful for predicting the classes whose changes might affect this class. For example, NSUPC*(Class4) = |{Class1, Class2, Class3}| = 3.
10. Number of the subclasses of a class (NSUBC): This counts the direct children of a class. In the example, NSUBC(Class1) = |{Class2}| = 1.
11. Number of the elements in the transitive closure of the subclasses of a class (NSUBC*): In the example, NSUBC*(Class1) = |{Class2, Class4}| = 2.
12. Number of messages sent by the instantiated objects of a class (NMSC): This metric concerns how many messages are sent by the objects instantiated from the class. It can be used to find out which classes are actively involved in interactions within a system. In the example, Class1 has one instance, i.e. Object1 in its associated sequence diagram, and it sends two messages to Object2. Thus the NMSC value for Class1 is 2.
13. Number of messages received by the instantiated objects of a class (NMRC): This metric is similar to the RFC metric of CK. In the example, Object1 receives one message from Object2, thus the NMRC value for Class1 is 1.
Because implementation details are not usually known at the early stages of the life cycle, a cohesion metric cannot be defined among the UML metrics.
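Several of the class metrics can be checked against the Figure 3 values quoted in the text. The sketch below is an illustration under stated assumptions: the dictionaries encode only what the text says about the example (the figure itself is not reproduced here), the function names are hypothetical, and the inheritance graph is assumed to be acyclic.

```python
# Weights quoted in the text for NATC2 and NOPC2.
VISIBILITY_WEIGHT = {"public": 1.0, "protected": 0.5, "private": 0.0}

# Direct superclasses of each class, as described for the Figure 3 example.
PARENTS = {
    "Class1": [],
    "Class2": ["Class1"],
    "Class3": [],
    "Class4": ["Class2", "Class3"],
}

# Attributes of Class1: (name, visibility, has a class-type parameter).
CLASS1_ATTRIBUTES = [
    ("a1", "private", True),
    ("a2", "protected", False),
    ("a3", "public", True),
]

# Associations (including aggregations) linked to Class1.
CLASS1_ASSOCIATIONS = [("association", "Class3"), ("aggregation", "Class4")]


def natc1(attributes):
    """Unweighted attribute count."""
    return len(attributes)


def natc2(attributes):
    """Attribute count weighted by visibility."""
    return sum(VISIBILITY_WEIGHT[visibility] for _name, visibility, _typed in attributes)


def superclasses(name):
    """Direct parents of a class (the set counted by NSUPC)."""
    return list(PARENTS[name])


def subclasses(name):
    """Direct children of a class (the set counted by NSUBC)."""
    return [child for child, parents in PARENTS.items() if name in parents]


def closure(start, neighbours):
    """Every class reachable from start via neighbours, excluding start itself."""
    seen = set()
    stack = list(neighbours(start))
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(neighbours(node))
    return seen


def dit(name):
    """Length of the longest path from a class up to a root class."""
    parents = PARENTS[name]
    return 0 if not parents else 1 + max(dit(parent) for parent in parents)


def cbc(attributes, associations):
    """Class-typed attributes plus NASC, following the CBC example in the text."""
    return sum(1 for _name, _visibility, typed in attributes if typed) + len(associations)


print(natc2(CLASS1_ATTRIBUTES), dit("Class4"), sorted(closure("Class4", superclasses)))
```

The same closure helper serves both NSUPC* and NSUBC*; only the neighbour function changes.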
3.4 Metrics for Message
In UML, messages sent between objects constitute interactions. The following new metrics are defined in order to measure the degree of interaction.
1. Number of the directly dispatched messages of a message (NDM): According to the UML semantics, a message can be an activator of other messages. For example, in Figure 3, the message op1() activates the message op3(), thus its NDM value is 1.
2. Number of the elements in the transitive closure of the directly dispatched messages of a message (NDM*): For example, NDM*(op1()) = |{op3(), op2()}| = 2.
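The two message metrics can be sketched as a direct lookup and a transitive closure over an activation table. This is a minimal illustration: the ACTIVATES table and function names are hypothetical, and the entry stating that op3() activates op2() is inferred from the NDM* example above rather than read from Figure 3.

```python
# Which messages each message directly activates.
# op1() -> op3() is stated in the text; op3() -> op2() is an inference
# from NDM*(op1()) = |{op3(), op2()}| and is therefore an assumption.
ACTIVATES = {
    "op1()": ["op3()"],
    "op3()": ["op2()"],
    "op2()": [],
}


def ndm(message):
    """Number of messages directly dispatched by the given message."""
    return len(ACTIVATES[message])


def ndm_star(message):
    """Messages in the transitive closure of the directly dispatched messages."""
    seen = set()
    stack = list(ACTIVATES[message])
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(ACTIVATES[current])
    return seen


print(ndm("op1()"), sorted(ndm_star("op1()")))
```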
3.5 Metrics for Use Case
Originally proposed by Jacobson, the use case is now fully integrated into UML [3, 10]. The Rational Unified Process (RUP), which accompanies UML, is based on the use case concept [4]. A use case captures a contract between the stakeholders of a system, also called its primary actors, about its behaviour. The use case gathers the different sequences of behaviour, or scenarios, together [2]. In short, the use case is a good way of eliciting requirements at the earlier stages of software development.
1. Number of actors associated with a use case (NAU; listed as NACU in Table 1): This metric computes the number of actors that are associated with a use case, and it is useful for measuring the importance of the requirement expressed by the use case. The reasoning is that requirements that concern many actors are likely to be important for the system to function properly as a whole. Note that we do not count normal system classes for this metric, because it concerns the interactions between a system and its stakeholders.
2. Number of messages associated with a use case (NMU): As explained before, a use case is further refined through its scenario. In UML, there are two kinds of scenario diagram, i.e. the sequence diagram and the collaboration diagram. These two kinds of diagram are completely isomorphic, meaning that one kind can be automatically replaced with the other without loss of the information it contains. The NMU metric counts the number of messages comprising the scenario of a use case. This metric is useful for tracing requirements into design-level elements.
3. Number of system classes associated with a use case (NSCU): This metric counts the number of classes whose objects participate in the scenario of a use case. Note that this metric does not include actors, as these are covered by the NAU metric. Like NMU, NSCU is good for estimating the impact of a requirement change on the system. Any changes to use cases spread to classes and the interactions of their objects, and vice versa.
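The three use case metrics can be sketched from the participants and messages of a single scenario. This is an illustrative sketch only: the Participant and Message types and the example scenario are hypothetical, and it assumes that the actors associated with a use case are exactly the actors appearing in its scenario.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Participant:
    """An object taking part in a scenario (hypothetical structure)."""
    obj: str
    cls: str
    is_actor: bool


@dataclass(frozen=True)
class Message:
    """A message exchanged in a scenario (hypothetical structure)."""
    sender: str
    receiver: str
    signature: str


def use_case_metrics(participants, messages):
    """Compute NAU, NMU and NSCU for the scenario of one use case."""
    actors = {p.cls for p in participants if p.is_actor}
    system_classes = {p.cls for p in participants if not p.is_actor}
    return {
        "NAU": len(actors),            # actors only, no system classes
        "NMU": len(messages),          # every message in the scenario
        "NSCU": len(system_classes),   # distinct classes, actors excluded
    }


# Invented scenario: two Item objects share one class, so NSCU is 2, not 3.
participants = [
    Participant("customer", "Customer", True),
    Participant("order1", "Order", False),
    Participant("item1", "Item", False),
    Participant("item2", "Item", False),
]
messages = [
    Message("customer", "order1", "create()"),
    Message("order1", "item1", "price()"),
    Message("order1", "item2", "price()"),
    Message("customer", "order1", "total()"),
]
print(use_case_metrics(participants, messages))
```

Counting distinct classes rather than objects follows the NSCU definition, which refers to classes whose objects participate in the scenario.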
4 Tool Support: UML Metrics Producer (UMP)
To facilitate the use of the proposed UML metrics, a CASE tool called UMP was developed on top of Rational Rose using its BasicScript language and API (Application Programmer’s Interface). Building on Rational Rose avoids the heavy burden of implementing a UML environment. Because a scripting language is used, UMP is only 1152 lines of code, including in-line comments. Figures 4, 5, 6, 7, and 8 show how UMP works on a model to produce the various metrics. UMP has a report-generating facility that details all four kinds of metrics data in XML format, as demonstrated in Figure 9. This metrics documentation can be kept throughout the development process and on into the software maintenance and evolution period.
Fig. 4. The main menu of UMP
Figure 10 shows the context in which UMP is used by a developer with a UML model. A model is processed by UMP and metrics data are obtained that can be viewed interactively on a screen or viewed using an XML-enabled web browser like Microsoft Internet Explorer.
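To give a feel for what an XML metrics report might contain, the sketch below builds one with Python's standard xml.etree.ElementTree module. It is not UMP's actual output format: the element and attribute names (metricsReport, element, metric, name, value) and the function build_report are invented for illustration, and UMP itself is written in BasicScript rather than Python.

```python
import xml.etree.ElementTree as ET


def build_report(model_name, metrics_by_kind):
    """Serialise nested metric values into a small XML document.

    metrics_by_kind maps a metric category (e.g. "class") to a mapping of
    element names to {metric abbreviation: value}.
    """
    root = ET.Element("metricsReport", {"model": model_name})
    for kind, elements in metrics_by_kind.items():
        group = ET.SubElement(root, kind)
        for element_name, values in elements.items():
            element = ET.SubElement(group, "element", {"name": element_name})
            for metric, value in values.items():
                ET.SubElement(element, "metric", {"name": metric, "value": str(value)})
    return ET.tostring(root, encoding="unicode")


# Values taken from the Figure 3 examples quoted in Section 3.
report = build_report(
    "Example",
    {
        "class": {"Class1": {"NATC1": 3, "NATC2": 1.5}},
        "message": {"op1()": {"NDM": 1}},
    },
)
print(report)
```

A report in this shape can be parsed again with ET.fromstring, which is one way such a document could later be linked to or compared against other XML artefacts.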
Fig. 5. Obtaining model metrics with UMP
Fig. 6. Obtaining class metrics with UMP
Fig. 7. Obtaining message metrics with UMP
Fig. 8. Obtaining use case metrics with UMP
Fig. 9. A metrics report in XML generated by UMP
Fig. 10. UMP Context
Classes are sorted alphabetically so that users can locate them easily. Messages often share the same signature. Therefore, an artificial sequence number is added to each message to indicate the order of its dispatch in the scenario diagram it belongs to. These numbers uniquely identify the messages in a diagram, but they are not taken into account in any of the computations described in this paper. At present, inconsistencies between the modelling elements of a UML model [12, 8] are not dealt with by the UMP tool. However, there are tools that check the consistency of UML models, for example the xlinkit UML Checker (footnote 5). A possible extension to UMP would be to employ such a checker as a preprocessor to identify any inconsistencies and eliminate them before UMP is applied.
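The presentation rules described above (alphabetical class order and artificial sequence numbers for messages) can be sketched as follows. The label format "n: signature" and the function names are assumptions made for illustration; the paper does not specify how UMP renders the numbers.

```python
def sort_classes(class_names):
    """Return class names in alphabetical order for display."""
    return sorted(class_names)


def label_messages(signatures):
    """Prefix each message with its dispatch order within one scenario diagram.

    The numbers only disambiguate messages with identical signatures;
    they play no part in any metric computation.
    """
    return [f"{index}: {signature}" for index, signature in enumerate(signatures, start=1)]


print(sort_classes(["Class3", "Class1", "Class2"]))
print(label_messages(["op1()", "op1()", "op3()"]))
```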
5 Conclusions and Future Work
In this paper, some new software metrics were introduced that can be used at the early stages of the software life cycle. They measure various characteristics of models, classes, messages, and use cases. In the future, these metrics need to be evaluated using industrial data. Linking the XML metrics report generated by UMP to the XMI (XML Metadata Interchange) document [9] of a UML model would also be a natural extension to this research. Existing XML technologies such as XLink and XPointer [6] will be used for the linking.
Footnote 5: http://www.xlinkit.com/dangerzone/UML/index.html

References
1. Shyam R. Chidamber and Chris F. Kemerer. A metrics suite for object oriented design. IEEE Transactions on Software Engineering, 20(6):476–493, June 1994.
2. Alistair Cockburn. Writing Effective Use Cases. Addison-Wesley, 2001.
3. Ivar Jacobson, Magnus Christerson, Patrik Jonsson, and Gunnar Övergaard. Object-Oriented Software Engineering – A Use Case Driven Approach. Addison-Wesley, Wokingham, England, 1992.
4. Ivar Jacobson, James Rumbaugh, and Grady Booch. The Unified Software Development Process. Object Technology Series. Addison-Wesley, Reading/MA, 1999.
5. Mark Lorenz. Object-oriented software metrics: a practical guide. PTR Prentice Hall, 1994.
6. E. Maler and S. DeRose. XML Linking Language (XLink). World Wide Web Consortium Working Draft. http://www.w3.org/TR/WD-xlink, March 3, 1998.
7. T. J. McCabe. A complexity measure. IEEE Transactions on Software Engineering, SE-2(4):308–320, December 1976.
8. C. Nentwich, L. Capra, W. Emmerich, and A. Finkelstein. xlinkit: A consistency checking and smart link generation service. Technical Report Research Note RN/00/06, Department of Computer Science, University College London, 2000. xlinkit.com White Paper.
9. Object Management Group. XML Metadata Interchange (XMI). Technical Report ad/98-10-05, Object Management Group, February 1998.
10. Object Management Group, Inc. OMG Unified Modelling Language Specification. Version 1.3, June 1999.
11. George Spanoudakis and Hyoseob Kim. Quantitative assessment of the significance of inconsistencies in object-oriented designs. In Proceedings of the 4th International ECOOP Workshop on Quantitative Approaches in Object-Oriented Software Engineering (QAOOSE2000), Sophia Antipolis and Cannes, France, June 2000.
12. A. Zisman, W. Emmerich, and A. Finkelstein. Using XML to specify consistency rules for distributed documents. In 10th International Workshop on Software Specification and Design, November 2000.