Two different views about software complexity
Ana Isabel Cardoso, Rui Gustavo Crespo, Peter Kokol

Abstract

The aim of this work is to analyse the metrics used in software engineering to assess the complexity of software products. First we discuss the difference between the complexity of a problem and the complexity of its solution (implementation), and we present several metrics that can be used to measure each of these aspects. Next we describe the software development process as a series of successive transformations: the first model constructed is the requirements document and the last model is the code, which represents a particular solution. Then we analyse the relation between complexity, effort and efficiency, and we describe some metrics that can be used to measure them. We observe that conventional software metrics do not reflect all aspects of problem/solution complexity, and we show that no single metric alone can measure all the complexity aspects of a product. Finally, we describe metrics used in the field of complex systems, such as algorithmic complexity, logical complexity, information content, Shannon entropy, statistical complexity and grammatical complexity. We compare them with the conventional metrics and show that they are suitable for measuring the global aspects of complexity in a software problem/solution. We conclude that it is possible to use complexity metrics from the study of complex systems in software engineering, with some advantages.

1. Complexity of the problem and its solutions

According to Lehman's definition of the software process [19], a solution of a problem is an operational model obtained from a set of transformations applied to the initial model of that problem. So behind each solution there is a problem. However, the complexity of the problem differs from the complexity of the solution: complex solutions may exist for simple problems. In this paper we will use the following definitions [9] [5] [16] [17] [18] [31]:

The complexity of a problem is the amount of resources required for an optimal solution of the problem.
The complexity of a solution is the amount of resources required for that particular solution of the problem.

In order to evaluate the complexity of a solution, several types of metrics are frequently used [9] [5]:
- Algorithmic Complexity reflects the complexity of the algorithm embodied in each solution. There are two types of metrics in this class: one is based on the algorithmic efficiency, the other on the structure of the algorithm.
- Cognitive Complexity reflects the amount of effort needed to understand the problem.

In order to assess problem complexity [8], metrics of Computational Complexity are used.
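To make the distinction concrete, here is a small illustration (not taken from the original paper): the same problem admits an optimal solution and a more expensive one, so the complexity of the problem is bounded by the optimal solution, while a particular solution may be more complex. The function names are ours.

```python
# Two solutions to the same problem: the sum of the first n natural numbers.
# The problem's complexity is bounded by the optimal solution (the closed
# formula, O(1)); the iterative solution is correct but less efficient,
# so its complexity exceeds the complexity of the problem.

def sum_closed_form(n: int) -> int:
    """Optimal solution: a constant number of primitive operations, O(1)."""
    return n * (n + 1) // 2

def sum_iterative(n: int) -> int:
    """A correct but less efficient solution: O(n) primitive operations."""
    total = 0
    for i in range(1, n + 1):
        total += i
    return total

assert sum_closed_form(1000) == sum_iterative(1000)
```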


2. Some metrics of solution complexity

The effort necessary to implement a product is frequently assessed on the algorithm. It may be calculated from the Algorithmic Efficiency: we measure efficiency in terms of the number of primitive operations required to produce a given output. Using the "Big O" notation [9], an algorithm is considered to have constant complexity when its efficiency is O(1), logarithmic complexity when its efficiency is O(log n), and so on. The reasoning is that the fewer primitive operations are needed, the less complex the algorithm is.

Complexity can also be calculated from the structure of the algorithm. There are three types of Structural Metrics [30] [5]:
- The control flow, which addresses the sequence in which instructions are executed in a programme;
- The data flow, which follows the trail of data as it is handled by a programme;
- The data structure, which evaluates the organisation of the data itself.

To measure the control-flow structure complexity of an algorithm, the algorithm must be transformed into a flowgraph [9] [20] [5] [32]. It has been proved that every flowgraph has a unique decomposition into a hierarchy of primes. The so-called Hierarchical Metrics are constructed on that decomposition tree. They are:
- the number of nodes;
- the number of edges;
- the largest prime measure;
- the D-structured measure.

One of the best-known metrics of this type is the McCabe Cyclomatic Complexity measure; it is also calculated on the flowgraph and represents the number of linearly independent paths through the graph [9] [5] [6] (a small sketch of this calculation is given at the end of this section). An Essential Complexity metric was also produced by McCabe in order to measure the overall level of structuredness of the algorithm; a more intuitive notion of essential complexity is simply the cyclomatic number of the largest prime in the decomposition tree. Several other metrics were proposed for this type of measurement; the VINAP [22] and knot [23] metrics can be studied in the given references. Another approach to measuring structural complexity is to measure the testing difficulty of the model: such metrics measure the minimum number of test cases and the test effectiveness ratio [24].

In order to measure the data-flow structure complexity [9] [5], we analyse the relationships between the modules of the programme. The design is transformed into a call graph, which represents the modular hierarchy and shows the exchange of information between the modules. Other types of diagrams [9] [5] [25] are used to analyse the exchange of data inside the modules. Metrics of this type are:
- Global modularity, in which the average module length is calculated. This is based on the reasoning that the shorter the modules, the lower the complexity;


- Tree impurity, which shows how far a given call graph deviates from being a tree. This is based on the reasoning that the more distant a call graph is from a tree, the more complex it is.

Coupling and Cohesion metrics of the modules [9] [5] [26] can also be used as complexity metrics: designs showing tight coupling and lack of module cohesion are complex. We can also use the total level of information flow through a system and the level of information flow between individual modules and the rest of the system. An example is the information flow complexity of Henry and Kafura, which uses the length, fan-in and fan-out of a module [9]. Based on this method, Shepperd [27] developed the Shepperd complexity, which concentrates solely on the information flow of a module. There is a high correlation between this measure and development time; the reasoning is that the more complex a module is, the longer the time required for its development.

Metrics analogous to the algorithmic-structure metrics were used to measure data structure complexity [29] [28]: each data type is considered a primitive, and the resulting primitive hierarchy is measured.
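The sketch promised above: a minimal illustration (ours, not from the original paper) of McCabe's cyclomatic complexity computed as V(G) = E - N + 2P from a flowgraph represented as an adjacency list. The example graph is hypothetical.

```python
# Minimal sketch: McCabe's cyclomatic complexity V(G) = E - N + 2P,
# where N is the number of nodes, E the number of edges and P the number
# of connected components (1 for a single flowgraph).

def cyclomatic_complexity(flowgraph: dict, components: int = 1) -> int:
    nodes = len(flowgraph)
    edges = sum(len(successors) for successors in flowgraph.values())
    return edges - nodes + 2 * components

# Hypothetical flowgraph of an if-then-else followed by a while loop.
example = {
    "entry": ["if"],
    "if":    ["then", "else"],
    "then":  ["while"],
    "else":  ["while"],
    "while": ["body", "exit"],
    "body":  ["while"],
    "exit":  [],
}

print(cyclomatic_complexity(example))  # 3: two binary decisions + 1
```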
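Similarly, a minimal sketch of the Henry and Kafura information flow measure, assuming the commonly cited formulation length x (fan-in x fan-out)^2; the module names and numbers below are hypothetical, not taken from the paper.

```python
# Sketch of the Henry-Kafura information flow measure for a module M:
# complexity(M) = length(M) * (fan_in(M) * fan_out(M)) ** 2

def information_flow_complexity(length: int, fan_in: int, fan_out: int) -> int:
    return length * (fan_in * fan_out) ** 2

modules = {
    # name: (length in lines of code, fan-in, fan-out) -- hypothetical values
    "parse_input":   (120, 1, 3),
    "update_state":  (200, 4, 2),
    "render_output": (80,  2, 1),
}

for name, (length, fan_in, fan_out) in modules.items():
    print(name, information_flow_complexity(length, fan_in, fan_out))
```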

3. Metrics of problem complexity

Finding the optimal solution is the relevant issue in measuring problem complexity, so compression techniques are used. We compress data in order to use less memory to store it and less time to manipulate it. The reasoning behind the compression process consists in finding strings of frequently used characters and substituting them with a shorter code. Problem complexity is then defined as the number of bits needed to represent the product after compression.
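As a rough illustration (the paper does not prescribe a particular compressor), one can approximate this measure with an off-the-shelf compressor: the compressed size in bits of an artefact gives an upper bound on its information content. The sample texts below are hypothetical.

```python
# Rough sketch: approximate the compression-based complexity of a software
# artefact as the size, in bits, of its compressed representation.
# A general-purpose compressor only gives an upper bound on the true value.
import zlib

def compressed_size_bits(text: str) -> int:
    return 8 * len(zlib.compress(text.encode("utf-8"), level=9))

# A highly regular text compresses far better than an irregular one
# of the same length.
regular   = "if x > 0: x -= 1\n" * 100
irregular = "".join(chr(33 + (i * 7919) % 90) for i in range(len(regular)))

print(compressed_size_bits(regular))    # small: much redundancy
print(compressed_size_bits(irregular))  # large: little redundancy
```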

4. Difficulties with software complexity metrics

In the field of software products and processes, the individual complexity metrics described in the previous sections have not been successful. In our opinion the main reasons are:
- Metrics are language and form dependent: a different metric has to be used for each programming language, and different metrics again for machine code, object code, etc. Software can be represented in many forms (requirements, specification, documentation, user interfaces, help files), and all these representations can be manifested in very different appearances: written text, graphical, symbolic, formal languages, etc. Again, many different (incompatible and incomparable) metrics have to be used to measure all of them. As a consequence, we are not able to measure the software in a holistic manner, compare various products, trace complexity through the design steps, etc.
- The output of a traditional complexity metric is a number, usually without any "physical" meaning or unit (we do not know what we are measuring) and without critical values indicating what is large or small, so no fundamental conclusions can be deduced or induced.
- The relations between metrics and software are rarely known; therefore such metrics are a poor basis for stating fundamental laws.

So, in order to improve the software process, we need other types of metrics.


5. Complexity

According to Morowitz, complex systems share certain features: they have a large number of elements, possess high dimensionality and represent an extended space of possibilities. Such systems are hierarchies consisting of different levels, each having its own principles, laws and structures. The most powerful approach for studying such systems has been reductionism, the attempt to understand each level in terms of the next lower level. The weakness of the reductionist approach is that it is unable to explain how properties at one level emerge from the next lower level, or how to understand that emergence at all. The problem is not the poverty of prediction but its richness: the operations applied to the entities of one level generate so many possibilities at the next level that it is very difficult to conclude anything. This demands radical pruning, and the main task of complexity as a discipline is to find common features of pruning (or, more generally, selection) algorithms across hierarchical levels and diverse subject matters. Like systems in management science and economics, software development is complex, dynamic, non-linear and adaptive; consequently, we need to gain a fundamental understanding of the software product and process using the science of complexity, the theory of chaos, fractals and other not yet tried "physically based" methodologies.

6. Complexity and computer programmes

Computer programmes, including popular information systems, usually consist of (or at least should consist of) a number of entities such as subroutines, modules and functions, arranged on different hierarchical levels [4] [7]. Considering the "laws of software engineering" and the concepts of programming languages, the emergent characteristics of these entities must be very different from the emergent characteristics of the programme as a whole. Indeed, programming techniques such as stepwise refinement, top-down design, bottom-up design or the more modern object-oriented programming are only meaningful if different hierarchical levels of a programme have distinguishable characteristics. So there is another, more complete way to assess holistic complexity: using Fractal Metrics or Entropy-Based Metrics [10] [15].
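A minimal sketch in the spirit of an entropy-based measure such as [10], under the assumption that the programme is treated as a string of tokens and the measure is the Shannon entropy of the empirical token distribution. The tokenisation below is deliberately crude (a real measure would use the language's lexical structure), and the sample snippet is hypothetical.

```python
# Sketch of an entropy-based measure: Shannon entropy (bits per token) of
# the empirical distribution of tokens in a programme text.
import math
import re
from collections import Counter

def token_entropy(source: str) -> float:
    tokens = re.findall(r"[A-Za-z_]\w*|\d+|\S", source)
    counts = Counter(tokens)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

sample = "x = 0\nfor i in range(10):\n    x = x + i\nprint(x)\n"  # hypothetical snippet
print(round(token_entropy(sample), 2))
```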

7. Quantitative properties of holistic complexity

A great many quantities [1] [3] [2] have been proposed as metrics of complexity. Gell-Mann suggests that many different metrics are needed to capture all our intuitive ideas about what is meant by complexity. Some of these quantities are computational complexity, information content, algorithmic information content, the length of a concise description of the set of an entity's regularities, logical depth, etc. In contemplating various phenomena we frequently have to distinguish between effective complexity and logical depth: for example, some very complex behaviour patterns can be generated from very simple formulas, like Mandelbrot's fractal set, the energy levels of atomic nuclei or the unified quantum theory, which means that they have little effective complexity and great logical depth. A more concrete measure of complexity, based on a generalisation of entropy, is correlation, which is relatively easy to calculate for a special kind of system, namely systems that can be represented as strings of symbols [11] [12] [13] [14].
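A minimal sketch of one simple way to quantify correlation in a string of symbols: the autocorrelation, at a given lag, of the characters' numeric codes. This is only illustrative and not the exact procedure of [11]-[14], which use long-range correlation analyses over programme texts; the sample string is hypothetical.

```python
# Naive correlation measure over a string of symbols: autocorrelation at
# lag k of the characters' numeric codes.

def autocorrelation(symbols: str, lag: int) -> float:
    xs = [float(ord(c)) for c in symbols]
    mean = sum(xs) / len(xs)
    var = sum((x - mean) ** 2 for x in xs)
    cov = sum((xs[i] - mean) * (xs[i + lag] - mean) for i in range(len(xs) - lag))
    return cov / var if var else 0.0

periodic = "abcabcabcabcabcabcabcabc"        # repeats with period 3
print(round(autocorrelation(periodic, 3), 2))  # ~0.88: strong at the period
print(round(autocorrelation(periodic, 1), 2))  # ~-0.44: weak/negative at lag 1
```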

8. Conclusions

We conclude that it is possible to use holistic complexity metrics in software engineering, with the following advantages:
1. To have a holistic view of the product that enables us to control the process more efficiently.


2. To compare several solutions in terms of complexity and information content and find the best one (as much information as possible with as little complexity as possible).
3. To have more accuracy, because these metrics take all the levels of a solution into account, so we can make better estimates of development cost, time and faults.

References

[1] Pines D (Ed.): Emerging Syntheses in Science, Addison-Wesley, 1988.
[2] Morowitz H: The Emergence of Complexity, Complexity 1(1):4, 1995.
[3] Gell-Mann M: What is Complexity?, Complexity 1(1):16-19, 1995.
[4] Cohen B, Harwood W T, Jackson M I: The Specification of Complex Systems, Addison-Wesley, 1986.
[5] Conte S D, Dunsmore H E, Shen V Y: Software Engineering Metrics and Models, Benjamin/Cummings, Menlo Park, 1986.
[6] McCabe G P: Introduction to the Practice of Statistics, Freeman, 1993.
[7] Watt D A: Programming Language Concepts and Paradigms, Prentice Hall, 1990.
[8] Wegner P, Israel M (Eds.): Symposium on Computational Complexity and the Nature of Computer Science, Computing Surveys 27(1):5-62, 1995.
[9] Fenton N E: Software Metrics: A Rigorous Approach, Chapman & Hall, 1991.
[10] Harrison W: An Entropy-Based Measure of Software Complexity, IEEE Transactions on Software Engineering 18(11):1025-1029, 1992.
[11] Kokol P, Kokol T: Linguistic laws and computer programme, Journal of the American Society for Information Science 47(10):781-785, 1996.
[12] Kokol P, Brest J, Žumer V: Long-range correlations in computer programme, Cybernetics and Systems 28(1):43-57, 1997.
[13] Kokol P, Brest J: Fractal structure of random programme, SIGPLAN Notices 33(6):33-38, 1998.
[14] Kokol P, Podgorelec V, Zorman M, Pighin M: Alpha - a generic software complexity metric, Project Control for Software Quality (ESCOM 1999), pp. 397-405, 1999.
[15] Samadzadeh-Hadidi M: Measurable Characteristics of the Software Development Process Based on a Model of Software Comprehension, Dissertation, University of Southwestern Louisiana, USA, 1987.
[16] Zuse H: History of Software Measurement, Berlin, 1998.
[17] Halstead M H: Elements of Software Science, Prentice-Hall, New York, 1977.
[18] Hill P (Ed.): Software Project Estimation: A Workbook for Macro-Estimation of Software Development Effort and Duration, ISBSG, 1999.
[19] Lehman M M et al: Program Evolution, Academic Press, London, 1985.
[20] Pressman R S: Software Engineering: A Practitioner's Approach, McGraw-Hill, 1994.
[21] Fenton N E, Whitty R W: Axiomatic approach to software metrication through program decomposition, Computer Journal 29(4):329-339, 1986.
[22] Bache R: Graph Theory Models of Software, PhD thesis, South Bank University, London, 1990.
[23] Woodward M R: Difficulties using cohesion and coupling as quality indicators, Software Quality Journal 2(2):109-128, 1993.
[24] Bache R, Mullerburg M: Measures of testability as a basis for quality assurance, Software Engineering Journal 5(2):86-92, 1990.
[25] Bieman J M, Debnath N C: An analysis of software structure using a generalized program graph, Proceedings of the IEEE-CS 9th International Computer Software and Applications Conference (COMPSAC 85), pp. 254-259, 1985.
[26] Yourdon E, Constantine L L: Structured Design, Prentice Hall, Englewood Cliffs, NJ, 1979.
[27] Shepperd M J, Ince D: Derivation and Validation of Software Metrics, Clarendon Press, Oxford, UK, 1993.
[28] Elliot J J: Data Complexity Aspects of Software, Alvey Project SE/69, PRRM, South Bank Polytechnic, London, 1988.
[29] van den Berg K G, van den Broek P M: Static analysis of functional programs, Information and Software Technology 37(4):213-224, 1995.
[30] Whitty R W, Lockhart R: Structural Metrics, ESPRIT 2 project COSMO, document GC/WP1/REP/7.3, Goldsmiths' College, London, 1990.
[31] Weyuker E J: Evaluating software complexity measures, IEEE Transactions on Software Engineering 14(9):1357-1365, 1988.
[32] Zuse H: Software Complexity: Measures and Methods, De Gruyter, Berlin, 1991.
