Proceedings of the 10th Int’l Software Process Workshop (ISPW-10), Ventron, France, June 17-19, 1996.
Considerations for an Organizational Memory for Software Development
Ernst Ellmer (1), Dieter Merkl (2)
(1) University of Vienna, Institute of Applied Computer Science, Liebiggasse 4/3-4, A-1010 Vienna, Austria
[email protected]
(2) Vienna University of Technology, Institute of Software Technology, Resselgasse 3, A-1040 Vienna, Austria
[email protected]
0. Preliminaries
“... organizational memory is organizational knowledge with persistence.” [1]
“... process knowledge is a valuable commodity and ought to be preserved and passed on. Materializing it is a critical necessity.” [10]
“... effective software process descriptions are one of the most valuable resources we as a society have.” [10]
1. The Position
Based on the above preliminaries, we believe that the efficiency and productivity of software development processes can be dramatically increased by making the knowledge gained during past projects persistent and thus reusable for future projects. We argue for the establishment of a computer-supported organizational memory for software development organizations. Furthermore, we feel that the rapid growth of software process modeling techniques provides a convenient means of realizing this proposal. A process model is an explicit representation of process knowledge and may thus serve as a means for storing and retrieving organizational knowledge about software process execution. We therefore argue for reusing software process models. An organizational memory for software development organizations may be implemented by populating and structuring a process model library and by providing mechanisms for retrieving and tailoring process models so that they can be applied during the execution (management) of upcoming projects.
2. The Motivation
A number of arguments speak for the reuse of software process models as a kind of organizational learning for software development organizations.

First, a library of reusable software process models represents a substantial amount of knowledge acquired by a number of software engineers over careers spent developing software. If this knowledge is not stored for future reuse, it is lost and future projects cannot benefit from it. Provocatively speaking, “one of the most valuable resources we as a society have” [10] is lost.

Second, reuse of process models constitutes a kind of computer support for the learning of organizations about their software development processes and makes the undertaking less dependent on single persons and special skills. Thus a process model library as an integrated part of a CASE environment may serve as a corporate repository of software development knowledge.

Third, a substantial amount of time can be saved by reusing (parts of) software processes or process models. The development of a software process model is a time-consuming and complex task, as several studies demonstrate (see e.g. [5]). By reusing process knowledge, the time and cost of software development can be reduced, and thus productivity can be increased.

Fourth, reused process models have, through their earlier application in other software projects, been tested and shown to be able to guide and/or enforce a software process. The reuse of process models thus contributes to improved quality of software products by raising the quality of the development process.

Fifth, reusing (parts of) process models may be a first step towards a standard software process model as proposed by several software process improvement approaches like the Capability Maturity Model (CMM) [11]. Such a standard process model, the organization process definition in terms of the CMM, represents the core of each model and is tailored to the specific needs of a certain project by process model customization and evolution.

Sixth, we see a special benefit of process model reuse for the development of software product lines. The similarity of the products to be developed results in similar process models guiding their development. Thus, there is an increased potential for the reuse of process models.
3. The Realization
We recently suggested an approach to process model reuse and carried out a case study on populating and structuring a process model library [7]. Our approach to software process reuse builds on recent developments in process modeling [2, 6] and on an approach to process description already proposed in the literature [3]. It meets the process model reuse requirements identified by [12]. For classification purposes we use artificial neural network technology that has proven useful for a similar problem, namely classifying software products for reuse [9]. In the spirit of [4] we divide the reuse process into three major phases. First, in a classification phase, reusable process models are collected and classified in order to serve as the basis of future projects. Next, in a retrieval phase, reuse candidates for a given software development problem are identified and retrieved. Finally, in a tailoring phase, the retrieved reuse candidates are evaluated and adapted to the specific engineering situation.

3.1 Process Model Reuse
The baseline of our approach is a process description (PD) consisting of two parts, namely a descriptive process definition document (PDD) and a formal process model (PM), as proposed by [3]. The PDD is a natural language document explaining the process and covering a large amount of semantics. Furthermore, the division of the PD into PDD and PM makes the approach independent of any process modeling formalism or process programming language, because the classification is based on the PDD rather than on the PM. This division also enables our approach to meet requirement 1 of [12], namely the description of reusable process building blocks in terms of goals and implementation. On the one hand, the process definition document provides a goal-oriented view on the process chunk. On the other hand, the process model covers its implementation aspects. This mechanism also takes into account recent developments in process modeling that argue for a goal-oriented approach (see e.g. [2]).

Requirements 2 through 5 of [12] state that there have to be two types of process building blocks, i.e. modules and views, describing all aspects of a process or only a certain view on the process, respectively. Our approach enables the population of a process model library with both modules and views at the same time. The key to this feature is once again the distinction between PDD and PM. The views concept, which captures only a certain perspective on a whole process, is also proposed by [6] as a general process modeling paradigm. In a sixth requirement, [12] points out that process models have to be refined/aggregated as well as generalized/specialized. In our approach, process building blocks for reuse can be represented at different levels of abstraction, and each of these levels may be stored independently in the library.
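The structure of a library entry described above can be illustrated with a minimal sketch. All class, field, and entry names below are hypothetical and chosen only for illustration; in particular, the numeric abstraction level and the sample entries are assumptions, not part of the approach in [3] or [12].

```python
from dataclasses import dataclass
from enum import Enum


class BlockKind(Enum):
    MODULE = "module"  # covers all aspects of a process
    VIEW = "view"      # covers only one perspective on a process


@dataclass(frozen=True)
class ProcessDescription:
    """One reusable building block: PD = PDD + PM."""
    name: str
    pdd: str                    # natural language process definition document
    pm: str                     # formal process model, kept opaque (any formalism)
    kind: BlockKind
    abstraction_level: int = 0  # hypothetical scale: 0 = coarsest (life-cycle model)


def classification_text(pd: ProcessDescription) -> str:
    """Classification looks only at the PDD, never at the PM."""
    return pd.pdd


# Two invented entries at different levels of abstraction.
library = [
    ProcessDescription(
        name="life_cycle",
        pdd="Sequential life-cycle with analysis, design, coding and test.",
        pm="<formal model in formalism A>",
        kind=BlockKind.MODULE,
        abstraction_level=0,
    ),
    ProcessDescription(
        name="review_design",
        pdd="The design is reviewed and the outcome is reported.",
        pm="<formal model in formalism B>",
        kind=BlockKind.VIEW,
        abstraction_level=2,
    ),
]
```

Because `classification_text` never touches `pm`, entries written in different formalisms can share one library, which is the independence argued for above.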
3.2 Classification
As already mentioned, we use the natural language process definition document (PDD) for reuse purposes. Our approach relies on keywords extracted automatically from the documents. These keywords are then used as the input data to an artificial neural network which performs the task of classification. Specifically, each document is described by a set of keywords extracted from the full text of the PDD. Subsequently, each document is represented as a binary-valued vector in which each component corresponds to a possible document feature, i.e. a keyword. An entry of zero denotes that the corresponding feature is not used in that particular PDD, whereas an entry of one means that the corresponding feature is used in the PDD. The ultimate goal of document classification is to uncover the semantic similarities between documents. In terms of our application domain, we are interested in a grouping of closely related software process models or, more specifically, a grouping of process definition documents that describe related processes. We used this classification process to structure a library of software process models. The results are described in full detail in [7]. We designed this case study to demonstrate a wide spectrum of the features of our approach. In particular, we used a set of 38 process building blocks to represent an experimental process model library. We populated the library with both modules and views in the sense of [12]. Moreover, we used process models at highly different levels of abstraction. The spectrum of abstractions starts with life-cycle models and ends with well-defined subprocesses of the ISPW-6 example [8]. Furthermore, examples of refine/aggregate relationships between process models are used.
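The binary document representation can be illustrated with a small sketch. It covers only the encoding step, not the neural network. The keyword extraction shown here (lowercase words minus a stop word list) is a crude stand-in assumed for illustration, and the two sample PDD texts and all function names are invented.

```python
import re


def extract_keywords(text, stop_words):
    """Assumed stand-in for keyword extraction: lowercase words minus stop words."""
    words = re.findall(r"[a-z]+", text.lower())
    return {w for w in words if w not in stop_words}


def build_vocabulary(keyword_sets):
    """All keywords seen in any document, in a fixed (sorted) order."""
    return sorted(set().union(*keyword_sets))


def to_binary_vector(keywords, vocabulary):
    """1 if the keyword is used in the document, 0 otherwise."""
    return [1 if term in keywords else 0 for term in vocabulary]


STOP_WORDS = {"the", "a", "and", "is", "are", "of", "by", "to"}

# Invented miniature PDD texts.
pdds = {
    "review_design": "The design is reviewed by the team.",
    "modify_code": "The code is modified and the code is compiled.",
}

keyword_sets = {name: extract_keywords(text, STOP_WORDS) for name, text in pdds.items()}
vocabulary = build_vocabulary(keyword_sets.values())
vectors = {name: to_binary_vector(kws, vocabulary) for name, kws in keyword_sets.items()}

print(vocabulary)
print(vectors)
```

Each vector has one component per vocabulary entry, so all documents in a library share the same dimensionality and can be fed to a classifier as fixed-length input.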
To summarize our highly encouraging results, the artificial neural network proved successful in uncovering the semantic similarities between the various process models and structured the experimental process model library accordingly.

3.3 Retrieval
The objective of the retrieval phase is to find candidate models for reuse within a future project. The first step is to formulate a query. Following the approach presented in this paper, the query has to be formulated in natural language. The more detailed the query, the higher the chance of finding a detailed reusable process model. Nevertheless, even a vague description of the future project may yield reuse candidates, albeit at a high level of abstraction. The technique we apply for retrieval is the same as the one used for classification. The natural language query is transformed into a vector representation by automatic keyword extraction from the full text of the query. Subsequently, the query vector is compared with the process description (PD) vectors of the process model library, and the most similar ones are presented to the process engineer as reuse candidates.
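The comparison of a query vector with the library vectors might be sketched as follows. The similarity measure is an assumption: the text does not name one, so the Jaccard coefficient on binary vectors is used here purely for illustration. The vocabulary, library entries, and function names are likewise invented.

```python
def jaccard(a, b):
    """Shared keywords divided by keywords present in either vector (assumed measure)."""
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    return both / either if either else 0.0


def retrieve(query_vector, library_vectors, top_k=3):
    """Return the names of the top_k most similar library entries."""
    ranked = sorted(
        library_vectors.items(),
        key=lambda item: jaccard(query_vector, item[1]),
        reverse=True,
    )
    return [name for name, _ in ranked[:top_k]]


# Invented vocabulary and PD vectors for a three-entry library.
VOCABULARY = ["code", "compile", "design", "review", "test"]
LIBRARY = {
    "design_review": [0, 0, 1, 1, 0],
    "modify_code":   [1, 1, 0, 0, 0],
    "unit_test":     [1, 0, 0, 0, 1],
}


def encode_query(text):
    """Encode a natural language query against the fixed vocabulary."""
    words = set(text.lower().split())
    return [1 if term in words else 0 for term in VOCABULARY]


query = encode_query("review the design of the module")
candidates = retrieve(query, LIBRARY, top_k=2)
print(candidates)
```

A vague query sets fewer components to one and therefore tends to match coarse entries whose PDDs use only general terms, which is consistent with the remark above about candidates at a high level of abstraction.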
3.4 Tailoring
The goal of the final phase, tailoring, is to produce a process model fine-tuned to the special needs of the concrete project at hand. In our view, tailoring cannot be fully automated by a tool, but the process engineer can rely on the knowledge from past projects provided by the retrieval phase described above. In our approach the process engineer has two alternatives. First, he/she can take a model at a high level of abstraction and refine it top-down to meet the specific needs of a concrete project. Second, he/she can take low-level process chunks and assemble them bottom-up into a suitable process model. Whichever way he/she chooses to create a process model for a project, the process engineer has to enrich “old” knowledge, represented by reuse candidates from a process model library, with “new” knowledge about the current project, such as the methodologies to be used or specific requirements of the application domain. This is because in software engineering usually no two projects are the same in each and every respect, and thus reused knowledge has to be adapted to the characteristics of the project at hand.
4. Discussion
In this section we raise some important questions concerning the reuse of process models and its potential for establishing an organizational memory, both in general and with respect to our approach.

Is process model reuse yet practicable? Software process technology is a relatively new discipline, and in order to provide process models of sufficient quality for reuse, its approaches have to be tried out within real-world software development environments.

Is process model reuse enough for implementing an organizational memory? Process models as we know them from software process technology cover only a part of the knowledge about the performance of software processes. Thus we feel that beyond the information about the behavior (processes) there is also a need for knowledge about the static structure (agents, teams) within which processes are performed. This information should also be part of an organizational memory.

Can we learn from product reuse? There is an undeniable similarity between software processes and software products [10], leading to the conclusion that well-known approaches from product reuse may be adapted and applied to process reuse.

How to ensure PDD-PM consistency? Our approach to process model reuse is based on a natural language description of the process (PDD) rather than on its formal model (PM). If the two are not consistent, the retrieved reuse candidates are probably not appropriate. However, we feel that this is a research problem in its own right and thus not exclusively brought about by our approach.

Are artificial neural networks the best choice for classifying process models for reuse? The technology we use for classification has proven useful for classifying software products for reuse [9] and yields better results than, for example, cluster analysis. Further experiments are needed to determine the optimal configuration and parameters of the neural net we employ.

5. Conclusion
In this paper we argued in favor of the establishment of a reuse culture in software process modeling. We pointed out the benefits of such a reuse culture, the most prominent of which is the development of an organizational memory. In order to enable the reuse of process models, we identified classification, retrieval, and tailoring as the three phases of any meaningful attempt to provide assistance to the process engineer. We presented our own approach, which relies on the natural language explanation contained in the process definition document, as a step in this direction. The results from a case study with highly divergent process models are especially encouraging and should provide the basis for further research.

References
[1] Ackermann, M. S., “Augmenting the Organizational Memory: A Field Study of Answer Garden”, Proc. ACM Conf. on Computer Supported Cooperative Work, Chapel Hill, October 1994.
[2] Arbaoui, S., Oquendo, R., “Goal Oriented vs. Activity Oriented Process Modelling and Enactment: Issues and Perspectives”, Proc. European Workshop on Software Process Technology 1994, Springer LNCS 772, pp. 170-176.
[3] Armitage, J. W., Kellner, M. I., “A Conceptual Schema for Process Definitions and Models”, Proc. 3rd Int. Conf. on the Software Process (ICSP-3), 1994, IEEE CS Press.
[4] Basili, V. R., Rombach, H. D., “Support for Comprehensive Reuse”, Software Engineering Journal, September 1991.
[5] Barghouti, N. S., Rosenblum, D. S., “A Case Study in Modeling a Human-Intensive, Corporate Software Process”, Proc. 3rd Int. Conf. on the Software Process (ICSP-3), 1994, IEEE CS Press.
[6] Conradi, R., Liu, C., “Process Modelling Languages: One or Many?”, Proc. 4th European Workshop on Software Process Technology (EWSPT-4), Noordwijkerhout, The Netherlands, April 1995, Springer LNCS 913.
[7] Ellmer, E., Merkl, D., “Classifying Process Models for Reuse”, University of Vienna, Technical Report, December 1995.
[8] Kellner, M., Feiler, P., Finkelstein, A., Katayama, T., Osterweil, L., Penedo, M., Rombach, H. D., “ISPW-6 Software Process Example”, Proc. 1st Int. Conf. on Software Process, Washington, 1991, IEEE CS Press.
[9] Merkl, D., Tjoa, A M., Kappel, G., “Learning the Semantic Similarity of Reusable Software Components”, Proc. 3rd Int. Conf. on Software Reuse, Rio de Janeiro, Brazil, November 1994, IEEE CS Press.
[10] Osterweil, L., “Software Processes are Software Too”, Proc. 9th Int. Conf. on Software Engineering, Monterey, California, 1987, IEEE CS Press.
[11] Paulk, M. C., Curtis, B., Chrissis, M. B., Weber, C. V., “Capability Maturity Model for Software, Version 1.1”, Technical Report CMU/SEI-93-TR-24, Carnegie Mellon University, February 1993.
[12] Rombach, H. D., “Modularizing Software Process Models for Reuse”, Proc. 7th Int. Software Process Workshop, Wadern, Germany, March 1993, IEEE CS Press.