THEORETICAL AND PRACTICAL COMPLEXITY OF MODELING METHODS

Highlighting the importance of taking into account typical usage when estimating the complexity of a systems development method.

By John Erickson and Keng Siau

COMMUNICATIONS OF THE ACM August 2007/Vol. 50, No. 8

Despite the introduction of UML 2.0, UML 1.X remains the workhorse of many object-oriented development efforts. UML 1.X consists of nine distinct diagramming techniques that support OO systems development: Use Case, Class, Activity, Statechart, Collaboration, Sequence, Object, Component, and Deployment Diagrams. Each of these diagrams necessarily possesses a large number of constructs that make the diagram what it is and that differentiate it from, or connect it to, the other diagrams. Despite the standardization of UML by the Object Management Group, researchers and practitioners have often criticized UML’s complexity and the ambiguity of its constructs [10]. A set of complexity metrics developed by Rossi and Brinkkemper [8] was used in [11] to analyze the nine diagramming techniques in UML and compare them to other modeling methods.

In addition, the recent and ongoing proposals to revise and enhance the Unified Modeling Language (such as UML 2.0) can also be seen, at least in part, as yet another attempt to convince programmers, developers, clients, and educators to try to achieve the goal of executable modeling. Executable model capability means developers would, with the push of a button, transform models developed during the Systems Analysis and Design portion of the systems development process into working applications; this has been a highly desirable goal for some developers for at least the last 20 years. While executable models may or may not be the real end goal of the recent revisions to UML, one has only to look at the new UML to observe one fairly evident characteristic: UML is larger and more complex than other OO modeling techniques and, we argue here, more difficult to learn and use.

THE SETTING

It is becoming increasingly difficult to develop useful and secure applications and systems. While UML is simply used as an example here, the ideas we propose can be applied to other modeling languages, programming languages, analysis techniques, or perhaps even other fields and disciplines. Before we can discuss levels of complexity, we must consider a few ideas that might shed some light on the core issues facing system developers (or anyone else dealing with varying levels of complexity in business). Systems are becoming more complex, probably at least partially because of such influencing factors as required and enhanced functionality (Web interfaces), interoperability (runs on different platforms), and security, as well as a variety of other reasons. Still other trends that impact the size and complexity of applications include systems such as enterprise resource planning, supply chain management, and customer relationship management. These types of systems are extremely large and complex, and require not only close internal cooperation within each implementing organization, but also external cooperation and connection to business partners up and down the supply chain, all the way from end customers to raw materials suppliers. Even such applications as operating systems have become larger and increasingly complex.

Systems development has traditionally been driven by the expediencies inherent to development in the first place, including such forces as time and money. This situation has led to a wide variety of development methods, some aimed at specific application types or sizes, and others of a more general nature. The mere existence of a large number and variety of development methods implies that one size does not fit all, or perhaps conversely, that a development method claiming to fit all or even more than a few applications will necessarily be very large and complex to be capable of doing what it says it will do.

Therefore it should not seem unreasonable to assume that systems development has become commensurately more complex simply to keep pace with the increased complexity of the applications being developed; moreover, if the development process has not become more complex, we might want to question why. Kim, Hahn, and Hahn [5] indicated it is generally the case that as systems become more complex, so do the diagrams (models) that represent those systems in the development process.

Between 1989 and the mid-1990s, up to 50 different OO analysis and design methods appeared. While some of these methods were intended for specific and limited applications or system types, a number of them purported to be adequate for a wider variety of applications. If companies are to choose and utilize the best method for their organization in terms of systems development, they need a means to compare these methods [12]. In order to make sense of the huge number of available methods, Rossi and Brinkkemper [8] developed a set of metrics that would measure the different methods and provide a means of comparison, at least in the area of complexity. The relative complexity of a development method assumes importance when considering usage and development costs.

THEORETICAL COMPLEXITY

Rossi and Brinkkemper’s [8] metrics are based on metamodeling techniques, and purport to measure the complexity of the method under analysis. According to [8], complexity is critical to measure because researchers believe complexity is closely related to how easy a specific method is to use, and also how easy the method is to learn. One of Rossi and Brinkkemper’s more crucial caveats is that the measured complexity of a given system does not solely translate into less complex methods being superior to more complex methods (that is, a “good” or “bad” method could be “good” or “bad,” regardless of how complex it is). Table 1 shows eight of the metrics and their composition [8].

Table 1. Eight of the Rossi and Brinkkemper metrics and their composition.

Rossi and Brinkkemper Metrics
$n(O_T)$ – count of object types per technique
$n(R_T)$ – count of relationship types per technique
$n(P_T)$ – count of property types per technique
$\bar{P}_O(M_T)$ – average number of properties per object type
$\bar{P}_R(M_T)$ – average number of properties per relationship type
$\bar{R}_O(M_T)$ – number of relationship types possible per object type
$\bar{C}(M_T)$ – average complexity for entire technique
$C'(M_T)$ – total conceptual complexity of the technique

Siau and Cao’s [11] research applied Rossi and Brinkkemper’s complexity metrics to UML and compared UML’s complexity with 36 modeling techniques from 14 methods, as well as each of the 14 methods in aggregate, finding that UML is from two to 11 times more complex than other modeling approaches. Table 2 shows a number of Siau and Cao’s results.

Table 2. Results from Siau and Cao.

Diagram          $\bar{C}(M_T)$    $C'(M_T)$
Class            0.10              26.40
Use Case         0.17              10.39
Sequence         0.13               7.87
Statechart       0.09              15.39
Component        0.11              15.65
Activity         0.13              11.18
Collaboration    0.14               8.12
Deployment       0.20               9.95
Object           0.33               5.92

Legend
$\bar{C}(M_T)$ – average complexity for entire technique
$C'(M_T)$ – total conceptual complexity of the technique

We can make (at least) two observations related to the complexity metrics developed by Rossi and Brinkkemper [8] and adopted by Siau and Cao [11]. First, Rossi and Brinkkemper developed, used, and presented what we define as the theoretical complexity of the modeling techniques. Second, theoretical complexity is the maximum value that complexity can assume using those definitions, because the metrics are related to or use all of the defined constructs of the modeling technique to which they are applied as a measure of complexity (using the Rossi and Brinkkemper measures). Theoretical complexity therefore represents the upper limit of complexity, because the metrics were formulated based on the total number of objects, relationships, and property types defined in the modeling techniques. In other words, all of the metrics are mathematically related to the total numbers of constructs used in the modeling technique.

Figure 1. A perspective on complexity in practice. Total theoretical complexity (all possible UML constructs [11]): 106. Measured practical complexity (subset of UML constructs selected by Delphi participants [10]): 41.
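As a rough illustration of how a metric such as the total conceptual complexity $C'(M_T)$ in Table 1 depends on construct counts, the sketch below assumes the Euclidean-norm form usually attributed to Rossi and Brinkkemper [8], $C'(M_T) = \sqrt{n(O_T)^2 + n(R_T)^2 + n(P_T)^2}$. This article does not state the formula, so it should be read as an assumption. The function name, the optional `weights` parameter (a hypothetical extension, reflecting the suggestion in this article that some construct types may deserve more weight than others), and the counts 7, 18, and 18 are ours; the counts were chosen only because they reproduce the Class diagram value of 26.40 in Table 2.

```python
import math


def total_complexity(n_ot, n_rt, n_pt, weights=(1.0, 1.0, 1.0)):
    """Sketch of C'(M_T) as the Euclidean norm of the construct counts.

    n_ot, n_rt, n_pt: counts of object, relationship, and property types.
    weights: hypothetical per-type weights (not part of the original metric);
             the default of all ones gives the unweighted form assumed above.
    """
    w_o, w_r, w_p = weights
    return math.sqrt((w_o * n_ot) ** 2 + (w_r * n_rt) ** 2 + (w_p * n_pt) ** 2)


# Illustrative counts only: assumed, not taken from this article.
class_total = total_complexity(n_ot=7, n_rt=18, n_pt=18)
print(f"{class_total:.2f}")  # 26.40

# Doubling the weight on relationship types (a made-up weighting) raises the score.
weighted = total_complexity(7, 18, 18, weights=(1.0, 2.0, 1.0))
print(weighted > class_total)  # True
```

Because every count enters the norm, the value can only grow as constructs are added to a technique, which is why a metric of this shape behaves as an upper bound on the complexity a user might meet.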


PRACTICAL COMPLEXITY

We propose practical complexity as a subset of theoretical complexity: as people use a modeling language, they do not always, and probably not even most of the time, use all of the available (or possible) constructs in the language. This is analogous to the idea that while the English language consists of hundreds of thousands of words, most English-speaking people only use a small fraction of these in their day-to-day discourse. Similarly, there are a huge number of functions provided in Microsoft Word, but most people only use a small number of them in writing and formatting their documents. Therefore, we propose equating practical complexity with a use-based core (kernel) of the language or modeling system in use; we will return to this idea later.

WHY PRACTICAL COMPLEXITY?

There are several possible reasons for the inadequacy of theoretical complexity in estimating complexity in practice. Although Siau and Cao’s [11] complexity indices indicated that class diagrams are about 2.5 times more complex than use case diagrams, the analysis was based on all objects, relationships, and property types as formulated by Rossi and Brinkkemper [8]. In practice, not all constructs in each diagram are used all the time. For example, the class diagram can contain many relationship types (association, aggregation, composition, generalization, dependency) and objects (abstract class, notes, constraints, packages, subsystems, interface), but a typical class diagram only uses a subset of these. By including all constructs, whether they are used or not, as a measure of complexity, theoretical complexity may not be the best measure of the complexity a user encounters in practice. Figure 1 shows the relationship between theoretical and practical complexity.

Another typical reason that complexity is a problem for people is the limited nature of short-term memory. Miller [7] argued that the primary bottleneck in human cognition is our (limited) ability to store seven-plus-or-minus-two chunks of information in short-term memory. Although short-term memory is a concern, decomposing a complex problem into sub-problems can help to alleviate the limitation [9]. For example, when a group of users was asked to understand the class diagram, which can have many object, relationship, and property types, the users would not attempt to understand every element in the class diagram simultaneously. Instead, the users would decompose the diagram into manageable sub-diagrams and understand each sub-diagram in turn. In this case, the short-term memory limitation is partially overcome [2]. Current complexity metrics do not take this into account. We argue here that our ability to decompose complex problems into sub-problems should be factored into complexity metrics formulation.

An additional reason for the inadequacy of theoretical complexity estimation is that there might be a need to assign weights to different constructs. For example, a construct that is more likely to result in a short-term memory problem should be assigned more weight in the complexity metrics than one that is less likely to result in a short-term memory problem. With respect to UML, for example, we would argue that objects are less likely than relationships to strain short-term memory, as they are more or less “independent.” Relationships, on the other hand, must be interpreted with associated objects to make sense. Hence, one relationship will consume more short-term memory resources than an object [2].

Finally, and perhaps most importantly, we may also argue from an 80/20 perspective that practical complexity is more relevant to systems development than theoretical complexity. In essence, the 80/20 rule of thumb [6] says that 80% of common software solutions (software development projects) can usually be completely specified by using only 20% of the language constructs. If that is true, then we propose that only the most commonly used constructs constitute the majority of software development efforts, and that (approximately) 20% of the language should therefore define practical complexity. Moreover, if many of the constructs are rarely or never used, it would not be necessary to learn the complete syntax of the language in order to develop the majority of systems.

The proliferation of different types of systems also impacts the area of complexity. For example, using UML, real-time systems might more heavily use a set of constructs that deals more with timers, clocks, and state changes, such as those presented in Statechart diagrams, and enterprise systems might depend more upon portraying more abstract and higher-level Activity diagrams, while Web-based systems might lean more toward some combination of the two. Figure 2 depicts the basic idea relating complexity to different system types.

Figure 2. Complexity related to different types of systems. Labels: Total Possible Constructs; Common Constructs; Real-Time Systems; Enterprise Systems; Web-Based Systems.

RELATED RESEARCH

A Delphi study conducted by the authors in 2004 investigating practical complexity provides the following details. The Delphi study assembled a panel of 29 globally diverse UML expert practitioners who were asked to identify the most important and most useful UML diagrams and, within each diagram, the most important constructs. They were also asked to identify a use-based kernel of UML, which can be equated to practical complexity. The experts clearly identified a UML kernel and established a basis for the ideas proposed herein. The results for the nine UML diagrams are summarized in Table 3. Individual diagram results are also available, but are not shown here due to space limitations. The mean is the arithmetic average of the participant ratings on a 1–5 scale, the standard deviation is a measure of dispersion, and the percentage “Yes” for Kernel is a measure of the agreement or consensus level among the participants after three rounds of the Delphi.

Table 3. The most important diagrams according to Delphi participants.

Construct        Mean    Standard Deviation    % “Yes” for Kernel
Class            1.00    0.00                  100.0%
Use Case         1.61    0.79                   90.9%
Sequence         1.73    0.70                   95.5%
Statechart       1.81    0.51                  100.0%
Component        2.31    0.70                   31.8%
Activity         2.41    0.55                   27.3%
Collaboration    2.57    0.87                   22.7%
Deployment       2.69    0.75                    9.1%
Object           3.00    0.86                    9.1%
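To make the idea of a use-based kernel concrete, the sketch below filters the Table 3 results by agreement level and compares the per-diagram $C'(M_T)$ values from Table 2 for the selected diagrams. The 90% cutoff, the function names, and the simple summation are our assumptions for illustration; in particular, the sums are a naive surrogate and are not how the values 106 and 41 shown in Figure 1 were derived.

```python
# (diagram, mean rating, standard deviation, % "Yes" for kernel), from Table 3.
TABLE_3 = [
    ("Class", 1.00, 0.00, 100.0),
    ("Use Case", 1.61, 0.79, 90.9),
    ("Sequence", 1.73, 0.70, 95.5),
    ("Statechart", 1.81, 0.51, 100.0),
    ("Component", 2.31, 0.70, 31.8),
    ("Activity", 2.41, 0.55, 27.3),
    ("Collaboration", 2.57, 0.87, 22.7),
    ("Deployment", 2.69, 0.75, 9.1),
    ("Object", 3.00, 0.86, 9.1),
]

# Total conceptual complexity C'(M_T) per diagram, from Table 2.
TABLE_2_TOTAL = {
    "Class": 26.40, "Use Case": 10.39, "Sequence": 7.87,
    "Statechart": 15.39, "Component": 15.65, "Activity": 11.18,
    "Collaboration": 8.12, "Deployment": 9.95, "Object": 5.92,
}


def select_kernel(rows, min_agreement=90.0):
    """Return diagrams whose kernel agreement meets the (assumed) cutoff."""
    return [name for name, _mean, _sd, pct in rows if pct >= min_agreement]


def naive_total(diagrams, totals=TABLE_2_TOTAL):
    """Sum per-diagram C'(M_T) values; a crude stand-in, not the paper's method."""
    return sum(totals[name] for name in diagrams)


kernel = select_kernel(TABLE_3)
print(kernel)  # ['Class', 'Use Case', 'Sequence', 'Statechart']

# Class vs. Use Case ratio, matching the "about 2.5 times" figure in the text.
print(f"{TABLE_2_TOTAL['Class'] / TABLE_2_TOTAL['Use Case']:.2f}")  # 2.54

# Kernel share of the naive all-diagram sum.
share = naive_total(kernel) / naive_total(TABLE_2_TOTAL)
print(f"{share:.0%}")
```

Raising `min_agreement` to 95.0 would drop Use Case from the kernel, which shows how sensitive a use-based estimate is to the consensus threshold chosen.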

After three rounds, the Delphi participants identified the most important diagrams as Class, Use Case, Sequence, and Statechart (see Table 3). In addition, at least 90% of the assembled experts agreed that those four diagrams should comprise a UML kernel. Applying the Rossi and Brinkkemper metrics to the kernel naturally results in a lower complexity assessment, which is not at all surprising. However, if the kernel diagrams truly represent the portion of UML that most people commonly use, then the reduction in measured complexity can be considered as one possible surrogate for practical complexity, perhaps a more realistic complexity than other metrical approaches.

CONCLUSION

We have argued here that theoretical complexity might not accurately predict complexity in practice. We have completed research that uses an existing metric set to estimate the complexity in practice of modeling methods. Although formulating measures of practical complexity is difficult, and practical complexity depends on many circumstances (project domains, structured/semi-structured/unstructured), determining an estimation of practical complexity is possible and useful. For example, Function Point Analysis [1], the Constructive Cost Model [3], and the Activity-Based Costing model [4] are illustrations of the usefulness of estimation, even rough estimation.

From a practical perspective, it seems relatively certain that if increased complexity is a characteristic of new systems and systems development methods, then even more expertise will be required of the developers and the organizations and companies for whom the systems are being developed. However, does the increase in size and number of constructs of a modeling method really affect its complexity in practice? Do we use all the functions in Microsoft Excel for every spreadsheet analysis? Obviously we do not, but those functions are there if we need them, and as we gain proficiency with the basic functions, it usually becomes easier for us, through automaticity and task decomposition, to learn other functions (or new software releases) as we find it necessary, and work around our cognitive limitations.

In measuring the complexity of a systems development method, it is incomplete and misleading to compute the complexity based on all possible object types and relationship types in the method. We do not use all the functions provided by SPSS for each statistical analysis. Similarly, we do not use all the constructs provided in UML for all modeling tasks. To provide a more complete picture and a more accurate account of how developers use modeling methods in practice, we must take into account the actual usage and practice of the modeling method. In other words, in addition to theoretical complexity, it should be possible to provide an estimation of the practical complexity of a modeling method based on the typical usage of the modeling method. A realistic estimation of the complexity in practice of a modeling language can suggest better ways of learning and using the various development methods that are currently in use or under development.

References

1. Albrecht, A.J. and Gaffney, J.E. Software function, source lines of code, and development effort prediction: A software science validation. IEEE Transactions on Software Engineering 19 (Nov. 1983), 639–648.
2. Anderson, J. and Lebiere, C. The Atomic Components of Thought. Lawrence Erlbaum Associates, 1998.
3. Boehm, B.W. Software Engineering Economics. Prentice-Hall, Englewood Cliffs, NJ, 1981.
4. Cooper, R. and Kaplan, P.S. Measure costs right: Make the right decisions. Harvard Business Review (Sept./Oct. 1988), 96–103.
5. Kim, J., Hahn, J., and Hahn, H. How do we understand a system with (so) many diagrams? Cognitive processes in diagrammatic reasoning. Information Systems Research 11 (2000), 284–303.
6. Kobryn, C. Will UML 2.0 be agile or awkward? Commun. ACM 45, 1 (Jan. 2002), 107–110.
7. Miller, G. The magical number seven, plus or minus two: Some limits on our capacity for processing information. The Psychological Review 63, 2 (1956), 81–97.
8. Rossi, M. and Brinkkemper, S. Complexity metrics for systems development methods and techniques. Information Systems 21, 2 (1996), 209–227.
9. Siau, K. Information modeling and method engineering: A psychological perspective. Journal of Database Management 10, 4 (1999), 44–50.
10. Siau, K., Erickson, J., and Lee, L. Theoretical versus practical complexity: The case of UML. Journal of Database Management 16, 3 (2005), 40–57.
11. Siau, K. and Cao, Q. Unified Modeling Language (UML): A complexity analysis. Journal of Database Management 12, 1 (Jan. 2001), 26–34.
12. Siau, K. and Tan, X. Evaluation criteria for information systems development methodologies. Commun. AIS 16 (2005), 856–872.

John Erickson is an assistant professor in the College of Business Administration at the University of Nebraska at Omaha.
Keng Siau is a professor in the College of Business Administration at the University of Nebraska-Lincoln.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. © 2007 ACM 0001-0782/07/0800 $5.00
