Dimensions of Software Architecture for Program Understanding
Hausi A. Müller
Kenny Wong
Scott R. Tilley
Department of Computer Science, University of Victoria, Victoria, BC V8W 3P6
Abstract
Software architecture is usually considered in terms of software construction rather than software understanding. Architectures for construction typically embody design patterns based on software engineering principles. In contrast, architectures for understanding represent change patterns and business rules based on conceptual models. This paper presents three dimensions of software architecture for program understanding. In each dimension, the user of the architecture plays a central role.
Keywords: program understanding, software architecture, software evolution.
This work was supported in part by the British Columbia Advanced Systems Institute, the IBM Software Solutions Toronto Laboratory Centre for Advanced Studies, the IRIS Federal Centres of Excellence, the Natural Sciences and Engineering Research Council of Canada, and the University of Victoria.
† Software Engineering Institute, Carnegie Mellon University, Pittsburgh, PA 15213-3890
1 Introduction
Software architecture has traditionally been oriented toward new software construction, not toward software understanding for maintenance and evolution. This orientation has led to a view of software architecture as consisting of design patterns, reusable components, and structural dependencies. In this view, architecture is a framework for tracking requirements, a technical basis for design, a managerial basis for cost estimation and process management, an effective basis for reuse, and a basis for dependency and consistency analysis [1].
While helpful, this view ignores the importance of a multitude of architectures based on change patterns, business rules, or conceptual models used during program understanding. The users and uses are different. Architectures for software design and construction cater to designers, integrators, testers, and inspectors; architectures for software understanding cater to software engineers who need to understand software systems during long-term maintenance.
For software understanding especially, we argue for an alternative, yet compatible, interpretation of architecture that places the individual user in a central role. The need to consider individual users is partly evidenced by the following fact. Software exists as a hybrid of two imperatives: intangibility and tangibility [2]. The intangibility imperative views software as an abstract idea similar to mathematics. The tangibility imperative concerns more immediate, accessible, and physical concepts. The main observation is that intangibles are best addressed and understood at the level of individuals, not committees. Thus, a comprehensive treatment of understanding software requires support not only for groups, but for individual users.
2 A user-oriented interpretation
For software understanding, the notion that there is an architecture existing in an objective state, independent of the engineers probing it, is flawed. We believe there is no single, definitive architecture of a software system. Instead, there are many architectures corresponding to many diverse purposes and users. The single-architecture view is limited and ignores levels of user involvement, thereby lacking context. In effect, architectures are user-created and depend in part on how we observe them and what we choose to see. In addition, there is no right architecture for a software system that is valid for all users. Each user has a different interpretation of what the software is about. These are not mere subjective and virtual "perceptions" of a single, true architecture; an architecture to a user is very real and meaningful. Individual maintainers, managers, and customers have their own architectures.
Unlike architectures for construction, which typically deal with exact information for purposes such as code generation, debugging, and integration, architectures for understanding must allow and deal with inexact or partial information. The art of effective understanding is to know what to leave out and what to ignore [4]. Incompleteness is the norm, not only for practical reasons of the scalability and immaturity of the analysis methods, but because there cannot be a definitive architecture in an evolving system. Even inconsistency and conflicts from trying to consolidate two architectures are acceptable, since humans can easily deal with incomplete and inconsistent information. This differs from the specifications and design architectures used by computers, which need to be exact, complete, and consistent.
An architecture for understanding may span various levels ranging from the concrete to the abstract: implementation, structural, functional, and behavioral [3]. For one user, an architecture may contain a combination of detailed code, structural design patterns, object interaction behavior, and functional purpose. For another user, an architecture may contain a combination of entirely different information: code complexity measures, maintenance effort, risk analyses, and personnel assignments. There is no conflict in having multiple architectures for understanding. Moreover, these architectures are continually changing and in flux, not only because the software is evolving, but also because the users and their needs are changing. Program understanding necessitates a subtle appreciation of these rich and diverse architectures.
The idea of multiple architectures has a major impact on the design of tools and methodologies for program understanding. Many program understanding tools deal with the software as if it has a single architecture, such as module interconnections, that is right for everyone. This ignores the fact that users each have their own conceptual models, methods of reasoning, and needs for understanding. Diverse needs such as business rules and change patterns are lost in a module interconnection architecture. Annotating a module diagram with this information is inadequate and cluttering, and it relegates the information to second-class status for some users. It is irrelevant to a user whether this information is collected into a grand, unified architecture. What is needed is tool support for expressing and analyzing a tapestry of multiple, user-oriented architectures for understanding.
3 The role of the user
Since architectures for understanding are fundamentally user-created entities, it is logical that program understanding tools and methodologies involve the user in the understanding process. There are problems with tools that automatically create some standard "architectural" diagram of a software system without user input. Typically, these diagrams do not conform precisely to a user's needs and lack critical domain-specific information.
More importantly, there is no rationale for why the diagram is the way it is. The answer is not one of automatically producing more and different kinds of diagrams. Too much emphasis is currently being placed on fully automatic tools. Any understanding method or tool must support different users and mental models, be extensible and tailorable, and be applicable to multiple domains. Our solution focuses on user involvement [5].
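To make the idea of multiple, user-created architectures more concrete, the following minimal Python sketch lets each user hold a partial view that maps source entities to concepts and records a rationale for every placement. All names (UserArchitecture, seed_from_paths, disagreements) and the file paths are hypothetical, and seeding a view from top-level directories is only one assumed way a tool might produce an automatic first cut for the user to tailor.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class UserArchitecture:
    """One user's partial view of a system (hypothetical structure)."""
    owner: str
    # entity -> (concept, rationale); entities the user ignores are simply absent
    placements: Dict[str, Tuple[str, str]] = field(default_factory=dict)

    def assign(self, entity: str, concept: str, rationale: str) -> None:
        """Place an entity under a concept and record why."""
        self.placements[entity] = (concept, rationale)

    def concept_of(self, entity: str) -> Optional[str]:
        """Return the concept for an entity, or None if the user left it out."""
        placement = self.placements.get(entity)
        return placement[0] if placement else None


def seed_from_paths(owner: str, paths: List[str]) -> UserArchitecture:
    """Assumed automatic first cut: group each file under its top-level directory."""
    view = UserArchitecture(owner)
    for path in paths:
        top_level = path.split("/", 1)[0]
        view.assign(path, top_level, "automatic: top-level directory")
    return view


def disagreements(a: UserArchitecture, b: UserArchitecture) -> Dict[str, Tuple[str, str]]:
    """Entities both users placed under different concepts; reported, not rejected."""
    shared = a.placements.keys() & b.placements.keys()
    return {
        entity: (a.placements[entity][0], b.placements[entity][0])
        for entity in sorted(shared)
        if a.placements[entity][0] != b.placements[entity][0]
    }


paths = ["billing/tax.c", "billing/invoice.c", "ui/form.c"]
maintainer = seed_from_paths("maintainer", paths)
manager = seed_from_paths("manager", paths[:2])  # this user ignores the UI code
manager.assign("billing/tax.c", "tax law compliance", "business rule: regional tax rates")

conflicts = disagreements(maintainer, manager)
```

In this sketch the two views coexist: the manager's view is incomplete (it omits the UI file) and disagrees with the maintainer's view about one file, and neither condition is treated as an error. The recorded rationale is a simple stand-in for the explanation that a fully automatic diagram lacks.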
While comprehending a software system, it should be possible to include human input and expertise in the decision making. There is a tradeoff between what can be automated and what should or must be left to humans; the best solution lies in a combination of the two [6]. Hence, the process of understanding should be manual, semi-automatic, or fully automatic where applicable. Through user control, this process can be based on diverse contextual knowledge such as business policies, tax laws, or other high-level, semantic information not directly present in the source code.
4 Architecture dimensions
We identify three dimensions of architectures for understanding that affect user involvement: the level of abstraction addressed, the degree of application-domain knowledge used, and the degree of automation used (as shown in Figure 1). The orientation of the axes is particularly important. User involvement is minimal at the origin of this coordinate system and increases in the directions of its three axes. More user involvement is needed at higher levels of abstraction, at increased use of application-domain knowledge, and at decreased degree of automation.
Figure 1: Architecture dimensions. [Axes, from the origin outward: analysis level (textual to conceptual), retargetability (general to domain specific), automation level (automatic to manual).]
The origin suggests an architecture for understanding such as a simple flowchart: code-level abstraction, no application-domain knowledge, and completely automatic. We believe architectures (and their support tools) near the origin are less meaningful for understanding software. Comprehension is not significantly better than reading a code listing. It might be said that meaning is largely in the user's head and not in the source code. Typically, there is too much information generated by automatic tools. This concern is especially critical for understanding large, industrial legacy software systems [7].
These dimensions form a design space for program understanding tools needed to exploit user input and domain-specific knowledge. Examples of typical domains are programming language, implementation technique, financial information, and management style. The incorporation of this domain knowledge helps to complete a user's architecture.
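As an illustration only, the three dimensions described in this section can be written down as a small data structure. The sketch below is in Python; the class name DesignPoint, the scaling of each axis to the range [0, 1], the example coordinates, and the averaging formula for user involvement are all assumptions made for this example. The text treats the dimensions qualitatively and states only that involvement is minimal at the origin and increases along each axis.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DesignPoint:
    """A tool or architecture placed in the three-dimensional design space.

    Each coordinate is an assumed value in [0, 1]; 0 is the origin of the space.
    """
    abstraction: float       # 0 = textual, code level; 1 = conceptual
    domain_knowledge: float  # 0 = general; 1 = domain specific
    manual_control: float    # 0 = fully automatic; 1 = fully manual

    def __post_init__(self) -> None:
        # Reject coordinates outside the assumed [0, 1] scale.
        for name in ("abstraction", "domain_knowledge", "manual_control"):
            value = getattr(self, name)
            if not 0.0 <= value <= 1.0:
                raise ValueError(f"{name} must be in [0, 1], got {value}")

    def user_involvement(self) -> float:
        """Hypothetical score: the mean of the coordinates, so it is zero at
        the origin and grows along every axis."""
        return (self.abstraction + self.domain_knowledge + self.manual_control) / 3


# A simple flowchart generator sits at the origin.
flowchart = DesignPoint(abstraction=0.0, domain_knowledge=0.0, manual_control=0.0)

# Illustrative coordinates for a view organized around business rules.
business_rule_view = DesignPoint(abstraction=0.9, domain_knowledge=0.8, manual_control=0.5)
```

Any score that is non-decreasing in each coordinate would serve equally well here; the mean is chosen only because it is the simplest function with that property.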
5 Conclusions
Software architectures for construction and understanding cater to different audiences and are therefore inherently different. However, the interpretation of multiple, user-oriented architectures is equally applicable in both domains. From the perspective of software evolution, more effort is needed to address software understanding concerns during the design and construction phases of software systems.
Users need to interactively explore and reorganize discovered information. They need to be empowered to express and customize their own architectures. And they need tools to support semiautomatic methods to help ease the manual and more tedious parts of understanding.
During system construction, requirements and business rules are readily available. Thus, it would be easier, for example, to build and record hierarchical decompositions based on business rules during the design stage than during long-term maintenance, when the unearthing of business rules is a slow and tedious process. Ideally, design and understanding architectures would be built concurrently during the early phases of the software life cycle and evolve concurrently during the later stages.
References
[1] D. E. Perry and A. L. Wolf. Foundations for the study of software architecture. ACM SIGSOFT Software Engineering Notes, 17(4):40-52, October 1992.
[2] B. J. Cox. Planning the software industrial revolution. IEEE Software, 7(6):25-33, November 1990.
[3] J. Q. Ning. A Knowledge-Based Approach to Automatic Program Analysis. PhD thesis, Department of Computer Science, University of Illinois at Urbana-Champaign, 1989.
[4] M. Shaw. Larger scale systems require higher-level abstractions. ACM SIGSOFT Software Engineering Notes, 14(3):143-146, May 1989. Proceedings of the Fifth International Workshop on Software Specification and Design.
[5] S. R. Tilley. Domain-Retargetable Reverse Engineering. PhD thesis, Department of Computer Science, University of Victoria, January 1995. Available as technical report DCS-234-IR.
[6] E. Buss, R. D. Mori, W. M. Gentleman, J. Henshaw, H. Johnson, K. Kontogiannis, E. Merlo, H. A. Müller, J. Mylopoulos, S. Paul, A. Prakash, M. Stanley, S. R. Tilley, J. Troster, and K. Wong. Investigating reverse engineering technologies for the CAS program understanding project. IBM Systems Journal, 33(3):477-500, 1994.
[7] K. Wong, S. R. Tilley, H. A. Müller, and M.-A. D. Storey. Structural redocumentation: A case study. IEEE Software, 12(1):46-54, January 1995.