Qualitative Simulation of Software Evolution Processes

WESS'02 Eighth Workshop on Empirical Studies of Software Maintenance, Montreal, 2nd Oct 2002

Neil Smith

Juan F. Ramil

Computing Department, Faculty of Maths and Computing, The Open University, Walton Hall, Milton Keynes MK7 6AA, U.K.
http://mcs.open.ac.uk/ns938 http://mcs.open.ac.uk/jfr46

Introduction

Lehman's studies identified the software evolution phenomenon and led to a set of statements termed laws of software evolution [Leh85,Som92,Mdh02]. The term laws was used to highlight that they reflect forces largely independent of the technology used and outside the immediate control of those implementing the evolution of a system. Over the years the laws have been the basis of a number of process models [Rio77,Leh02a]. The purpose of these models ranges from manpower allocation, such as in [Rio77,Leh02a], to testing the empirical support for the laws, as suggested in [Ram02].

The building of a simulation model from the laws [Rio77,Leh02a] involves refinement of an informal statement into a precisely defined, executable model. The model builder must determine or assume rigorous relationships. Since the laws only express relationships informally, one needs to consult sources other than the laws (such as local process knowledge) or introduce assumptions to achieve a formal executable model. Moreover, the validation of the quantitative model will require the existence of data at the required level of detail and accuracy. If the focus is on long-term evolution, spanning several years or decades, the availability of precise quantitative behavioural detail cannot be taken for granted. In this regard, the empirical study of long-lived applications bears some resemblance to palaeontology [Ant01].

Qualitative simulation [Kui95] is one of a series of techniques that have been developed in order to enable behavioural modelling at a higher level of abstraction than that of quantitative simulation [For61,Kel99]. In this paper we report initial results of the use of qualitative simulation to execute a simple model derived from a sub-set of the laws.
Some preliminary conclusions are derived. We argue that qualitative simulation can be a useful technique for performing empirical studies of the software process and building process simulation models. Qualitative simulation has started to be applied to development projects [Sua02]. We believe that this paper reflects the first use of the technique in the context of long-term evolution processes.

Qualitative simulation

Qualitative reasoning (QR) is motivated by the need to exploit imprecise a priori knowledge about the domain being modelled. QR is useful when the relationships that govern behaviour are imprecise, fragmented and incomplete. This is the case in many medical, biological, and design situations and similarly applies to modelling the software evolution process. QR models reflect the real world at an abstract level and require far fewer assumptions during model building than conventional techniques.

Qualitative simulation [Kui95] is one qualitative reasoning technique. Instead of modelling a system as a set of ordinary differential equations (ODEs), the system is represented as a set of qualitative differential equations (QDEs). Each QDE represents a large set of possible ODEs, as each M+ function represents the set of all monotonically increasing functions.

In order to introduce the basics of qualitative simulation we present a simple simulation model of a bathtub, shown in figure 1. Its behaviour, in terms of the flow of water into and out of the bathtub, can be represented by the ODEs in figure 2a. However, before a tool such as an ODE solver can be used to find the behaviour, the ODEs must be instantiated with particular values for each of the parameters and functions. In contrast, the QDEs in figure 2b can be used as-is by a qualitative simulation engine such as QSIM [Kui95].

Figure 1: A bathtub

amount = f(level)
pressure = ρ.g.level
outflow = h(pressure)
netflow = inflow - outflow
d/dt amount = netflow
inflow = k
Figure 2a: ODEs for a bathtub

amount = M+(level)
pressure = M+(level)
outflow = M+(pressure)
netflow = inflow - outflow
d/dt amount = netflow
inflow = constant
Figure 2b: QDEs for a bathtub
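To make the contrast concrete, the following is a minimal sketch of what instantiating the ODEs of figure 2a involves before a numerical solver can be used. The particular choices here (a linear f, a square-root outflow function h, the constants, and the name simulate_bathtub) are illustrative assumptions and are not taken from the paper; they pick out just one member of the family of systems covered by the QDEs of figure 2b.

```python
import math


def simulate_bathtub(inflow, level0, top=1.0, steps=6000, dt=0.01):
    """Euler-integrate ONE quantitative instantiation of the bathtub ODEs.

    All constants and function shapes below are assumptions made for
    illustration; the QDE version leaves them unspecified.
    """
    area = 1.0    # assumed: amount = area * level (one possible f)
    rho_g = 1.0   # assumed: rho * g lumped into a single constant
    k_out = 0.5   # assumed: outflow = k_out * sqrt(pressure) (one possible h)

    amount = area * level0
    for _ in range(steps):
        level = amount / area
        if level >= top:
            return "overflow", level
        pressure = rho_g * level
        outflow = k_out * math.sqrt(pressure)
        netflow = inflow - outflow
        # d/dt amount = netflow, discretised with a forward Euler step
        amount = max(amount + netflow * dt, 0.0)
    return "equilibrium", amount / area


if __name__ == "__main__":
    print(simulate_bathtub(inflow=0.3, level0=0.1))  # fills towards a steady level
    print(simulate_bathtub(inflow=0.3, level0=0.5))  # empties towards a steady level
    print(simulate_bathtub(inflow=0.6, level0=0.5))  # reaches the brim
```

Each run yields a single trajectory for one choice of parameters, whereas a qualitative simulation of the QDEs is intended to cover every such choice at once.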

QSIM represents each variable in a qualitative manner. It records each variable's magnitude relative to a fixed set of landmark values and the sign of each variable's first time derivative. In the bathtub, the landmark values for the water level are ZERO and TOP. The initial water level is between these two values, denoted as the interval (ZERO, TOP).

Given some initial values, QSIM finds all possible variable assignments that are consistent with the specified QDEs. Each of these sets of assignments is termed a state. For each state, QSIM then finds all possible (consistent) successor states. As each M+ constraint represents a large family of functions, there will normally be more than one successor for each state. Again, for each new state, all its successors are found, and the process continues until either the model becomes quiescent or an earlier state is repeated. Each sequence of states is termed a behaviour.

When QSIM is applied to the bathtub model, it finds three initial states, depending on whether the part-full bath is emptying, filling or in equilibrium. It goes on to find five distinct qualitative behaviours: an emptying bath settles at a lower steady level, a bath in equilibrium stays at a constant level, and a filling bath can reach equilibrium before it overflows, reach equilibrium just as the water reaches the brim, or overflow. The five behaviours are illustrated by the behaviour tree in figure 3a. The five possible behaviours of the water level over time are shown in figures 3b to 3f.
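The qualitative representation described above can be sketched as a small data-structure exercise. The function below is a hypothetical illustration and is not part of QSIM: it maps a numeric value and its derivative onto a magnitude relative to ordered landmarks plus a direction of change. The labels "inc", "std" and "dec" and the tolerance parameter are assumptions chosen for this sketch.

```python
def qualitative_value(x, dx, landmarks, tol=1e-9):
    """Abstract a number and its time derivative into a qualitative value.

    landmarks is an ordered list of (name, value) pairs, e.g. ZERO and TOP.
    Returns (qmag, qdir): qmag is either a landmark name or a pair of
    neighbouring landmark names; qdir is the sign of the derivative.
    """
    for i, (name, value) in enumerate(landmarks):
        if abs(x - value) <= tol:
            qmag = name  # exactly at a landmark
            break
        if x < value:
            lower = landmarks[i - 1][0] if i > 0 else "-inf"
            qmag = (lower, name)  # in the open interval below this landmark
            break
    else:
        # no break: x lies above every landmark
        qmag = (landmarks[-1][0], "+inf")

    if dx > tol:
        qdir = "inc"
    elif dx < -tol:
        qdir = "dec"
    else:
        qdir = "std"
    return qmag, qdir


if __name__ == "__main__":
    level_landmarks = [("ZERO", 0.0), ("TOP", 1.0)]
    print(qualitative_value(0.5, 0.2, level_landmarks))  # part-full and filling
    print(qualitative_value(1.0, 0.0, level_landmarks))  # steady at the brim
```

A qualitative state in the sense used above would then be one such pair per model variable; generating consistent successor states is the job of the simulation engine and is not attempted here.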

Figure 3a: Behaviour tree for bathtub

Figure 3b: Behaviour 1: emptying bath

Figure 3c: Behaviour 2: constant level

Figure 3d: Behaviour 3: equilibrium below overflow point

Figure 3e: Behaviour 4: equilibrium as water reaches brim

Figure 3f: Behaviour 5: overflow

The amount of water in the bath is shown relative to the landmarks, and the values are annotated with arrows that show whether the variable is increasing, decreasing, or static at that time. The dotted lines connecting the points are simply visual aids and have no significance. Note that QSIM automatically inserts landmark values corresponding to the initial and final amounts of water. For example, figure 3b shows that the water level decreases until reaching a steady value, but it makes no claim about the precise, quantitative behaviour of the water level.

By generating all qualitatively possible behaviours, QSIM guarantees to find behaviours that encompass all physically realisable behaviours. However, the lack of detail in the qualitative model can induce some ambiguity in the results, and some behaviours generated by QSIM may not exist in the referent system. These are termed spurious behaviours.

A qualitative model based on the laws

Although the laws of software evolution are stated in text form (listed in the appendix), ODEs and QDEs can be derived from them. As a first step in exploring the use of qualitative simulation in modelling the functional growth of software, we have restricted ourselves to a subset of the laws: the second, fourth and sixth, which relate to software complexity, work-rate and functional growth, respectively. Even when using QDEs, there are numerous possible representations of the laws. Here we select one that has a particular appeal due to its simplicity. Figure 4 shows a QDE representation of this subset of the laws. In a fuller paper we hope to be able to discuss the implications of one particular representation over alternatives. In the rest of this paper we consider only the representation shown in figure 4.

This representation essentially states that as the functional size of a system increases, there is a need to perform complexity control work, termed anti-regressive [Leh85,Leh02a,Rio77]. This concept has been explained in other models [Rio77,Leh02a]. A key concept in the representation in figure 4 is that of cumulative anti-regressive deficit.
As successive versions of a software system are generated, the increase in software size and the superposition of change upon change upon change is likely to increase system complexity. This may bring with it decreasing evolvability and maintainability, a decline in the growth rate, and even stagnation. In order to ensure that the software remains evolvable one needs anti-regressive work [Leh85], which includes activities such as restructuring, refactoring and documentation. However, the anti-regressive work competes with the progressive work to obtain sufficient priority and resources. Given marketplace pressures, anti-regressive work is not, in general, regarded as of high priority. When the resource applied to anti-regressive work is less than a certain level, the neglected anti-regressive work accumulates, becoming a deficit that eventually impacts productivity. Only restoration to an adequate level can reverse the growth trend and restore productivity.

The QDEs in figure 4 are intended to reflect, in a very simplified way, the above observation that as functional growth takes place, a minimal amount of anti-regressive work is needed in order to maintain the growth rate. This concept is of remarkable simplicity and avoids the need for complexity metrics involving measurable program attributes. It is intended to support management of evolution at the total system level [Leh02a]. The QDEs in figure 4 postulate relationships between the cumulative anti-regressive deficit, C, and the rest of the attributes, whose names appear on the right hand side of the figure.

Expression                   Source
dS/dt = Es - Hc              Law II and VI
Hc = M+(C)                   Law II
dC/dt = M+(Es) - M+(Ec)      Law II
d(Es+Ec)/dt = 0              Law IV

Attributes
C: Cumulative anti-regressive deficit
Es: Effort applied to increase functional size
Ec: Effort applied to control complexity
Hc: Hindrance imposed by cumulative anti-regressive deficit
S: Functional Size

Figure 4: A QDE model of a sub-set of Lehman's laws of software evolution
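For comparison with the qualitative treatment, the QDEs of figure 4 can be given one possible quantitative instantiation. The sketch below is an assumption-laden illustration rather than the authors' model: each M+ relation is replaced by a linear function with made-up coefficients (a, b, h), C is clamped at zero, and the run stops once the growth rate dS/dt reaches zero. The function name simulate_laws and all numeric values are hypothetical.

```python
def simulate_laws(es, ec, c0=0.2, steps=2000, dt=0.01, a=1.0, b=1.0, h=0.5):
    """Euler-integrate one linear instantiation of the figure 4 QDEs.

    es, ec : constant efforts on growth and on complexity control
             (their sum is constant, so d(Es+Ec)/dt = 0 holds trivially)
    a, b, h: assumed slopes standing in for the M+ functions
    Returns (outcome, size, c).
    """
    size, c = 0.0, c0
    for _ in range(steps):
        hindrance = h * c           # Hc = M+(C), here linear
        growth = es - hindrance     # dS/dt = Es - Hc
        if growth <= 0.0:
            return "stagnation", size, c
        size += growth * dt
        # dC/dt = M+(Es) - M+(Ec); the deficit is assumed not to go below zero
        c = max(c + (a * es - b * ec) * dt, 0.0)
    return "growing", size, c


if __name__ == "__main__":
    print(simulate_laws(es=0.7, ec=0.3))  # deficit accumulates
    print(simulate_laws(es=0.5, ec=0.5))  # deficit held constant
    print(simulate_laws(es=0.3, ec=0.7))  # deficit driven down to zero
```

Under these assumptions the three parameter settings give, respectively, an accumulating deficit that ends in stagnation, a constant deficit with continued growth, and a deficit that falls to zero with continued growth.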

In the rest of this paper, "C" should be read "Cumulative anti-regressive deficit". Figure 5 shows the results of QSIM simulations performed with the model of figure 4. QSIM finds three qualitatively distinct behaviours, in terms of how C and the size of the software evolve over time. In one of these, presented in figure 5, size increases, as does C. As a consequence of the increasing C, the rate of growth decreases, eventually becoming practically zero, and the evolution process stagnates. In the second behaviour (not shown due to space limitations), the effort applied to complexity control activity is exactly what is needed to keep C constant. In this behaviour, C remains constant and size increases without bound. In the third behaviour (also not shown), sufficient effort is applied to actually reduce C. In this situation, the software would grow at a constant rate and C is kept equal to zero. These behaviours support the conclusion that a certain level of complexity control work is required to prevent evolution stagnation, as suggested in other simulation models [Rio77,Leh02a].

Figure 5: One of the simulated behaviours for the model shown in figure 4. "Complexity" should be read "Cumulative anti-regressive deficit"

The model in figure 4 is inadequate to represent feedback in the software process, as expressed in the eighth law and taking the form of management control. If a software system approaches stagnation, as illustrated in the first behaviour found by QSIM, management is likely to introduce a "consolidation" phase by moving resources away from normal work towards anti-regressive work. This situation can be modelled in QSIM by introducing state transitions [Kui95] into a model. Results are shown in figure 6.

Figure 6: Behaviour with state transitions. "Complexity" should be read "Cumulative anti-regressive deficit"

In the initial growth stage shown in figure 6, effort is directed towards increasing the software's size. When the software growth rate falls below a trigger level, the model switches to a consolidation state in which the effort is redirected towards anti-regressive work. When C falls to some sufficiently low level, the system reverts to the normal growth phase. Some simplifying assumptions in the model have been necessary in order to implement a simulation of this situation and to eliminate spurious behaviours that would otherwise have been generated by QSIM. The simplifications are that the effort is wholly directed to either increasing size or decreasing C, and that the state transitions occur when either the growth rate, dS/dt, or C is reduced to zero. The resulting behaviour, shown in figure 6, suggests the existence of periods of growth followed by periods of consolidation and clean-up.

To aid the comparison of the qualitative results with empirical observations, figure 7 shows one possible quantitative instantiation of the evolving size shown in figure 6. Note that the decreasing time derivative of the size indicates some form of growth curve that approaches an asymptotic value. Figure 8 shows the growth trends over releases of four software systems. Some of their characteristics are presented in [Leh02b]. The size is measured in number of modules, an indicator of functional power that has been useful in previous studies [Leh85], and is shown relative to the size of the first release with sufficient data to be considered in a previous study [Leh02b]. As can be seen, three of the patterns in figure 8 (that is, all except system “+”) resemble that of figure 7. System “+” would present a smaller growth rate, and hence a pattern closer to that of figure 7, when only the size of the core of the system is considered (see the plot in [Leh02a]).

The growth pattern of IBM OS/360-370, not presented in figure 8, is another exception to figure 7. Its growth pattern over releases was linear, with a superimposed ripple, up to releases 19-20. After that the growth pattern became oscillatory. This was probably due to excessive pressures for functional growth (positive feedback) that eventually led to its fission [Leh85]. The exponential growth trend of the Linux open source operating system is another exception, one that may be explained in terms of the proliferation of clones [Gdf00]. Despite these and possibly other anomalies, observations suggest [Leh02b] that growth patterns with features of the pattern in figure 7 are more common than other patterns, particularly in commercial, successfully evolving systems.
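The two-phase model described above can also be sketched numerically. The code below is an illustrative assumption, not the QSIM model used for figure 6: it applies the stated simplifications (effort wholly directed to one activity at a time; transitions when dS/dt or C reaches zero), replaces each M+ relation with a linear function with made-up coefficients, and additionally assumes that size is held constant during consolidation. The name simulate_with_transitions and all numeric values are hypothetical.

```python
def simulate_with_transitions(total_effort=1.0, steps=3000, dt=0.01,
                              a=1.0, b=1.0, h=0.5):
    """Alternate between growth and consolidation phases.

    growth:        all effort increases size; the deficit C accumulates
    consolidation: all effort reduces C; size is assumed to stay constant
    Returns a list of (phase, size, c) tuples, one per time step.
    """
    size, c = 0.0, 0.0
    phase = "growth"
    history = []
    for _ in range(steps):
        if phase == "growth":
            growth = total_effort - h * c      # dS/dt = Es - Hc with Ec = 0
            if growth <= 0.0:
                phase = "consolidation"        # growth rate has reached zero
            else:
                size += growth * dt
                c += a * total_effort * dt     # deficit accumulates
        else:
            c -= b * total_effort * dt         # deficit is worked off
            if c <= 0.0:
                c = 0.0
                phase = "growth"               # deficit cleared, resume growth
        history.append((phase, size, c))
    return history


if __name__ == "__main__":
    run = simulate_with_transitions()
    # Print a coarse view of the size trajectory: rises followed by plateaus.
    for phase, size, c in run[::200]:
        print(f"{phase:13s} size={size:5.2f} C={c:4.2f}")
```

With these assumed coefficients the size trajectory is a sequence of decelerating rises separated by plateaus, which is the general shape of the stepped growth pattern discussed in the text.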

Figure 7: One possible “quantitative” instantiation of "Size" as suggested in the “qualitative” simulation results of fig. 6 (vertical axis: Size; horizontal axis: Time, Releases)

Figure 8: Growth patterns over releases for four systems (plot title: Relative Size over Releases; vertical axis: s(i)/s(1); horizontal axis: rsn i). System “+” growth pattern is displayed over years instead of releases due to lack of release-based data [Leh02b].

Inflexion points are not reflected in the simple behaviours shown in figure 5. This suggests that a simple model, without transitions, is not fully adequate to describe the observations. In contrast, the behaviour derived from the model with transitions, as shown in figures 6 and 7, more closely matches some of the empirical observations. In three of the growth trends displayed in figure 8, rapid growth of the software soon reaches a plateau where little progress is made. After some time (or some releases) at this plateau stage, the growth curve bends upwards before reaching a new, higher plateau. In the QSIM model, this behaviour is simulated by the state transitions between growth and consolidation phases.

Further work

These preliminary results suggest that one possible explanation for the behavioural transition is a feedback-triggered change in the predominant type of work, from growth-dominated to complexity-control-dominated and vice versa, over subsequent phases. Additional empirical data and modelling work are required to further validate this hypothesis and to compare its empirical support with that of the alternative explanations for the observed rejuvenation transitions that have been proposed [Aoy02,Raj00,Tur02].

The exploration of qualitative simulation, of which preliminary results have been presented here, can be extended by building QDE models that include attributes implied by laws in addition to those in figure 4. Initial efforts in this direction show that QSIM generates a number of behaviours that do not appear to have been empirically observed. Some of these behaviours may be spurious, and will either not emerge in a more refined model or may suggest refinements to the laws of software evolution. Other behaviours may be feasible but unlikely to occur in practice. Understanding why these behaviours do not frequently occur may give insight into how the software development process is managed in industry. In order to validate the QDE models one would wish to systematically compare the empirical data to the simulation outputs.
This will be facilitated by techniques that extract the qualitative features of the data, such as the ones discussed in [Tes98].

Conclusions

QDEs facilitate simulation of the behaviour implied by high-level empirical generalisations such as the laws of software evolution. In this paper a preliminary QDE model, a possible instantiation of a sub-set of the laws of software evolution, has been used to simulate growth patterns of long-lived software. The results are reasonably in line with some of the discontinuous growth patterns observed, providing an alternative hypothesis for them in terms of transitions between growth-dominated and complexity-control-dominated phases.

More generally, the research surrounding Lehman's laws can be seen as falling into one of two groups. The first line of research starts from empirical data and seeks to identify models and behavioural invariances that can be generalised. For example, one may ask how the empirical data can be systematically abstracted into generalisations, or what degree of empirical support there is for any generalisations across different domains. This line of research includes testing the degree of empirical support for the laws [Ram02]. Another line of research takes the generalisations as its starting point and seeks to derive practical results such as guidelines for management [Leh01]. Additional outcomes are simulation models aimed at increasing the understanding of the process and of its system dynamics and at exploring the impact of various process improvements [For61,Kel99,Leh02a]. QDEs can be of help in both lines of investigation by providing an intermediate level between high-level empirical generalisations such as the laws and conventional quantitative simulation.

Simulation of software processes based on quantitative techniques [Kel99] poses challenges when our knowledge of the process drivers is less than satisfactory and when the datasets are either too small or incomplete.
Techniques such as qualitative simulation have the potential to provide meaningful results in such circumstances. We believe that qualitative simulation has potential in the software process domain and is able to produce useful outputs, as it has in other fields.

References

[Ant01] Anton A and Potts C, Functional Paleontology: System Evolution as the User Sees It, Proc. ICSE 23, Toronto, 12-19 May 2001, pp. 421-430
[Aoy02] Aoyama M, Metrics and Analysis of Software Architecture Evolution with Discontinuity, in Aoyama M et al (eds.), Proc. 5th Intl. Workshop on Principles of Software Evolution, IWPSE 2002, May 19-20, Orlando, FL, pp. 103-107
[For61] Forrester JW, Industrial Dynamics, Productivity Press, Cambridge, MA, 1961
[Gdf00] Godfrey MW and Tu Q, Evolution in Open Source Software: A Case Study, Proc. Intl. Conf. on Software Maintenance, ICSM 2000, 11-14 Oct. 2000, San Jose, CA, pp. 131-142
[Kel99] Kellner MI, Madachy RJ and Raffo DM, Software Process Simulation Modelling: Why? What? How?, Journal of Systems and Software, Vol. 46, No. 2/3, April 1999, pp. 91-106
[Kui95] Kuipers B, Qualitative Reasoning - Modeling and Simulation with Incomplete Knowledge, MIT Press, 1995
[Leh85] Lehman MM and Belady L, Software Evolution - Processes of Software Change, Academic Press, 1985
[Leh01] Lehman MM and Ramil JF, Rules and Tools for Software Evolution Planning and Management, Annals of Software Eng., spec. iss. on Softw. Manag., vol. 11, issue 1, 2001, pp. 15-44
[Leh02a] Lehman MM, et al, Behavioural Modelling of Long-lived Evolution Processes - Some Issues and an Example, J. of Software Maintenance and Evolution: Research and Practice, Sp. Issue on Separation of Concerns, forthcoming
[Leh02b] id, An Overview of Some Lessons Learnt in FEAST, this proceedings
[Mdh02] Madhavji NH, Introduction to the Panel Session: Lehman’s Laws of Software Evolution, in Context, Proc. ICSM 2002, 3-7 October, Montreal, forthcoming
[Raj00] Rajlich VT, Bennett KH, A Staged Model for the Software Life Cycle, Computer, July 2000, pp. 66-71
[Ram02] Ramil JF, Laws of Software Evolution and Their Empirical Support, panel statement, Proc. ICSM 2002, 3-7 Oct., Montreal, forthcoming
[Rio77] Riordan JS, An Evolution Dynamics Model of Software Systems Development, in Software Phenomenology - Working Papers of the (First) SLCM Workshop, Airlie, Virginia, Aug 1977. Pub ISRAD/AIRMICS, Comp. Sys. Comm. US Army, Fort Belvoir VA, Dec 1977, pp. 339-360
[Som92] Sommerville I, Software Engineering, 4th Edition, Addison-Wesley, Wokingham, UK, 1992, pp. 536-538
[Sua02] Suarez AJ et al, Qualitative Simulation of Human Resources Subsystem in Software Development Projects, QR2002 6th Intl. Workshop on Qualitative Reasoning, Sitges, Spain, June 10-12, 2002, pp. 169-176
[Tes98] Tesoreiro R and Zelkowitz MA, Model of Noisy Software Engineering Data, 20th ICSE, Kyoto, Japan, April 19-25, 1998, pp. 461-476
[Tur02] Turski WM, A Simple Model of Software System Evolutionary Growth, in Madhavji NH, Lehman MM, Ramil JF and Perry D (eds.), Software Evolution and Feedback: Observations, Theory and Practice, forthcoming.

Appendix

Lehman's Laws of Software Evolution as stated in [Leh01].

No. (year), Brief Name: Law

I (1974), Continuing Change: An E-type system must be continually adapted else it becomes progressively less satisfactory in use
II (1974), Increasing Complexity: As an E-type system is evolved its complexity increases unless work is done to maintain or reduce it
III (1974), Self Regulation: Global E-type system evolution processes are self-regulating
IV (1978), Conservation of Organisational Stability: Average activity rate in an E-type process tends to remain constant over system lifetime or segments of that lifetime
V (1978), Conservation of Familiarity: In general, the average incremental growth (growth rate trend) of E-type systems tends to decline
VI (1991), Continuing Growth: The functional capability of E-type systems must be continually increased to maintain user satisfaction over the system lifetime
VII (1996), Declining Quality: Unless rigorously adapted to take into account changes in the operational environment, the quality of an E-type system will appear to decline as it is evolved
VIII (1996; recognised 1971, formulated 1996), Feedback System: E-type evolution processes are multi-level, multi-loop, multi-agent feedback systems
