in:

Seventh International Conference on Software Engineering and Knowledge Engineering (SEKE ’95) Rockville, Maryland, June, 1995

Pattern-Driven Reverse Engineering

Harald C. Gall, René R. Klösch
Vienna University of Technology

Roland T. Mittermeir
University Klagenfurt

Abstract

A fundamental weakness of conventional reverse engineering approaches is the lack of support in identifying program structures and recurring patterns. The integration of human domain knowledge represented via patterns can significantly improve design recovery results. Recovering a program’s design reveals recurring patterns that, at present, have to be identified by a human engineer in demanding tasks without automated assistance. In this paper we present extensions to our reverse engineering approach based on various design recovery patterns that can be used to further automate such demanding tasks. The design recovery patterns are defined according to several factors we consider essential in the reverse engineering process. Based on these influence factors we show the integration of such patterns in our reverse engineering approach and define the notion of pattern-driven reverse engineering. The implications of pattern-driven reverse engineering and the improvements to be achieved for program understanding are discussed for each kind of pattern in turn.

1 Introduction

Due to the growing number of legacy systems in industry that have to be maintained, reverse engineering has become an emerging research area during the past few years. But most conventional reverse engineering approaches suffer from an insufficient degree of automation and hence lack an important aspect of industrial relevance. Examining the results of various reverse engineering activities reveals similar recurring patterns on different levels of abstraction. Integrating patterns within the reverse engineering process seems to be a promising way to overcome the current situation by increasing the degree to which the process can be automated.

Design recovery as a special aspect of reverse engineering supports program understanding by answering fundamental questions, such as: What are the modules? What are the key data items? What are the software engineering artefacts (e.g. program description language, dataflow, data dictionary, etc.)? During the design recovery process, the analyst tries to recover various useful abstractions, e.g. informal diagrams, informal concepts, module structures, data and control flow. Programming languages do not include the constructs necessary to express information such as the informal conceptual abstractions behind the source code, mapping it to patterns known from previous experience. A human expert interprets code segments in informal terms. An automated system intended to support this process needs access to the same kind of “in-head” expertise, such as a knowledge base or an application domain model.

Current approaches are not able to help the analyst search through code to find structures and patterns of interest. Here we are not thinking of low-level search tools such as the UNIX command grep, but of sophisticated program understanding or redocumentation tools, which are often part of comprehensive reverse engineering tool-sets. One of the key problems a design recovery tool has to deal with is how to enable the analyst to keep track of the relationship between the various (usually high-level) abstractions and the segments of code that implement them [15].

In this paper, we point out how to extend our COREM reverse engineering approach using patterns on several levels of abstraction. We describe which kinds of patterns can be identified, how they influence the reverse engineering process, and why they significantly improve the program understanding process.

2 Related Work

Work on program understanding for software maintenance goes back to the 1980s [5, 9, 16], and much effort has been invested in this area. Systems like [10] or [12] aim at automated support for reengineering legacy systems, especially large systems implemented in COBOL. Like many COBOL-based approaches, they mainly focus on re-modularization and restructuring to improve the quality of COBOL code. Research has shown that the reverse generation of design abstractions from given source code needs additional knowledge about the application and its domain. This knowledge has to be represented in a form that can be matched against the source code in order to identify these design elements. Several research approaches deal with this problem [2, 3].

Contrary to many other approaches, Desire [1] explicitly considers informal information that is incorporated in the source code and uses domain knowledge to provide clues for understanding the formal source code structures. It represents this knowledge in a domain model that consists of program patterns, domain patterns, programming language patterns, and naming conventions. The elements of this domain model are matched against segments of the source code to derive the design information represented in the program. Thus, the domain model acts as a knowledge base of problem-oriented and solution-oriented expectations which provide frameworks for the interpretation of the code. By emphasizing the informal knowledge beyond pure source code, the Desire approach is an important step towards overcoming the typical limits of previous reverse engineering approaches. Although the domain model offers informal information, this approach suffers from the problem of bridging the wide gap between the very abstract, high-level informal domain concepts and the specific low-level source code.

In Recognizer [14], frequently used data structures (e.g., sorted lists or hash-tables) and algorithms (e.g., binary search) are represented in so-called clichés. This tool automatically finds occurrences of a given set of clichés in Common Lisp programs and builds a hierarchical description of the program in terms of the clichés it finds. However, syntactical variations of code or overlapping implementation parts limit the applicability of such methods to small non-commercial programs with low complexity. The genericity of such clichés, and hence the lack of correspondence to domain concepts, restricts them to abstract, application-independent, language-specific reverse engineering tasks.

A knowledge-based approach for program understanding was developed by Das [4] for maintaining and modifying source code. A set of rules provides a reasoning framework that helps to detect several classes of bugs. Program plans representing the logical structure of the input program are derived from the source code and can be used to check the consistency between the logical structure and the corresponding source code.

PAT [8] uses an object-oriented framework to represent programming concepts and a heuristic-based concept-recognition mechanism to derive high-level concepts from the source code using program plans. PAT supports the identification of abstract concepts in the source code and of the corresponding implementation parts. Although program plans seem promising, such approaches still cover only simple mathematical computations, sorting, and/or searching routines.

Paul and Prakash [13] defined a pattern language for specifying source code patterns as an extension to a source code language, but like many other approaches it lacks language constructs for semantic equivalence. This approach mainly concerns low-level source code segments and does not consider any domain concepts.

Next we present our approach of pattern-driven reverse engineering. The use of patterns for design recovery activities is an extension to our object-oriented reverse engineering approach COREM, already discussed in [6, 11]. In Section 3 we introduce the factors influencing software development as a basis for showing how to define various kinds of patterns for extending our approach. Subsequently, we describe each kind of pattern in detail and classify them according to different criteria. Section 4 summarizes the improvements that can be achieved by applying pattern-driven reverse engineering.

3 Pattern-Driven Reverse Engineering

3.1 Factors Influencing Software Development

To define patterns relevant for reverse engineering, it is necessary to identify the various factors relevant during regular (i.e. forward) software development. As shown in Figure 1, we have identified two major sources of influence on the development of a software system and hence on its source code: the customer, and the developer responsible for the production of the desired system.

The problem the customer needs solved is influenced by various objective and subjective aspects. Objective aspects are the requirements explicitly expressed by the client and, on a higher level, needs derived from the application domain. A second objective aspect is the business rules of the particular client company. Subjective aspects are mainly introduced by the users of the new system who are involved in the requirements analysis process.

While the customer aspects mentioned will result in problem patterns to be found in the resulting code, the solution patterns stemming from the specific way software development is carried out by the producer’s company likewise influence the resulting software system in many respects. The software development guidelines that form the basis for the work of a software development company determine the various tools used during the software engineering process. These tools in turn have substantial influence on the actual code produced. The guidelines affect the resulting product not only via the tools but also via general development practices (e.g. applied methodology or quality assurance activities) and via habits within the development teams that are not directly visible. Furthermore, already existing solutions, e.g. in the form of software architectures or libraries of reusable assets, restrict the variety of possibilities.

All these factors directly or indirectly influence the resulting code to a varying degree. Having them explicitly at hand when examining a given program during design recovery will not only speed up the process; it will also increase the quality of the resulting product, since knowledge of these patterns makes it possible to capture certain development rationales that ended up buried in the details of the code.

Figure 1: Factors influencing the software development process (labels shown: customer, application domain, companies’ business rules, PROBLEM; sw development company, sw development guidelines, reusable assets, software architectures; requirements analysis tools, design tools, implementation tools; CODE)

3.2 Traditional Basis

In [6, 11] we defined an object-oriented reverse engineering method called COREM that supports changing the architecture of procedural legacy systems to a modern object-oriented architecture. To recover objects with rich application semantics (and not just data stores), a human engineer is integrated into the COREM process. This human intervention is needed to introduce application domain knowledge that allows us to resolve conflicts usually occurring with traditional approaches.

In order to make the architectural change work, several program representations are reversely generated out of the procedural source code to increase the level of abstraction step by step: we start by generating structure charts (SC), use them to derive dataflow diagrams (DFDs), and then derive an entity-relationship diagram (ERD) of the examined program. The ERD constitutes a basis for deriving a reversely generated object-oriented application model (RooAM) to define objects. The results of this reversely generated RooAM are mapped to a forwardly generated FooAM that was built independently of the design recovery process (see Figure 2). This FooAM represents the knowledge about the application and its domain and supports identifying application-semantic data structures that are to become objects.

In this paper, we show how this traditional approach can be supported by shifting certain stereotypical tasks from the human expert to an automated tool. These stereotypical tasks mainly concern the recognition of patterns that constantly recur within a given reverse engineering venture. On the basis of these considerations, the COREM process is supported by explicitly considering various kinds of patterns relevant for a proper interpretation of the available code, for an easier construction of the forwardly generated/developed application model (FooAM), and for the mapping between the forwardly and the reversely generated application model. The program representations mentioned above (e.g. ERD, RooAM, and FooAM) allow us in the first step to define the following basic kinds of patterns (see Figure 2):

• language patterns (source code level)
• design patterns (design level)
• application patterns (application and application domain level)

Figure 2: COREM design recovery and basic patterns (labels shown: source code; design SC, DFD, ERD; application model RooAM; application model FooAM; mapping; human engineer; knowledge base; language pattern, design pattern, application pattern)

Based on the factors influencing the software engineering process, we define additional kinds of patterns:

• software solution patterns
• architectural patterns

We will discuss these patterns in turn in Section 3.3.

3.3 Design Recovery Patterns

An approach to successfully define and use patterns for reverse engineering activities should meet the following criteria:

• support the generation and use of patterns on different levels of abstraction
• distinguish problem- and solution-oriented patterns
• provide patterns that are generic to some degree:
  - vertically generic patterns focus on one specific application domain
  - horizontally generic patterns focus on many application domains and are, therefore, more general

Based on these criteria, the kinds of patterns identified above are categorized into three different areas of influence:

1. problem-oriented: customer driven,
2. solution-oriented: developer driven,
3. creative: stemming from the creative act of applying the developer’s solution methodology and techniques to the customer’s problem.

Figure 3 summarizes these considerations on the basis of the factors described in Section 3.1. Next we discuss the various kinds of patterns in detail. After that we show how these patterns fit into the categorization described above.

Figure 3: Design Recovery Patterns (labels shown: problem space, creative space, solution space; general domain model, customer model, application domain model, application model, application patterns; SW-development guidelines, SW-solution pattern, architectural pattern; design pattern, coding conventions, language pattern; CODE)

3.3.1 Application patterns

Application patterns are recurring high-level patterns within an application or within a specific instantiation of an application domain. They may have been saved from previous reverse engineering efforts in this application area or within this company. Hence, they stem from the application domain rather than from the specific application at hand. But they are instantiated chunks of domain knowledge, instantiated for the specific context in which the given reverse engineering effort takes place. To identify and pin down application patterns, we stratify the customer’s problem space into four levels (see Figure 3):

• general domain model (GDM),
• customer model (CuM),
• application domain model (ADM), and finally
• application model (AM)

These strata successively generalize from the original problem. They result from an interaction between the generalization of the nominal problem to be solved and the characteristics of the specific customer for whom this solution is to be provided. To highlight the difference between these models, consider the development of an invoicing system for ABC-Inc. The invoicing system will first of all exhibit certain properties and key architectural concerns which invariably appear in any invoicing system. These can be found and matched against the GDM for invoicing systems. The fact that ABC-Inc. is active in English and Spanish speaking countries, has about 10,000 clients worldwide, and has a host of other peculiarities (among them that invoices should never leave the company before the merchandise has left, but no later than one day after it has left) places special constraints on the solution. These constraints are captured in the customer model (CuM). Further, we know that ABC-Inc. is a publishing house, and publishing houses have certain peculiarities in their invoicing systems which are quite different from those of retailers or power supply companies. The interaction of these peculiarities from the client’s field of action with the kind of problem to be solved is captured in the application domain model (ADM). This application domain model is finally enriched by company-specific structures, thus leading to the model of the specific application at hand (AM).

3.3.2 Software solution patterns

Software solution patterns and architectural patterns constitute the core part of the creative intersection between the problem and its solution. With software solution patterns, we refer specifically to the following aspects: software development paradigms, techniques and models, software engineering methodologies (prototyping, waterfall model, fountain model, etc.), quality assurance concepts (e.g. ISO 9000, CMM, etc.), and specific software development guidelines.

At first glance, one would consider these solution patterns to be in the domain of the software producer. However, the strategy of attack they constitute might well be determined to a large extent by certain “environmental” aspects of the client which have to be taken into consideration.

The manifestation of such solution patterns is hard to describe on a general level. To a large extent, they might be influenced by rules and regulations pertinent to a certain application domain. To another extent, they might be influenced by the corporate culture of the developer’s organization as well as of the customer’s organization. (E.g. the comprehensive set of products for an invoicing system will be different if the system is developed for/by a government agency or for/by a small start-up company; the system will be different according to the maturity level of the developing organization; the system will be different depending on whether a document-driven or a prototyping approach has been followed.)

While such considerations will be visible in the source code that forms the basis for a reverse engineering venture, formalizing them is quite difficult. At present, we merely provide a slot for them, to be filled in on the basis of further research and empirical evidence.
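To make the categorization of Section 3.3 more tangible, the following is a minimal sketch of how a catalogue of pattern kinds might be recorded. It is an illustration under stated assumptions, not part of COREM: the names `Space`, `PatternKind`, `CATALOGUE`, and `open_slots` are hypothetical, and the short `formalization` strings merely paraphrase the text. The software solution pattern entry is left as an unfilled slot, as described above.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Space(Enum):
    """The three areas of influence named in Section 3.3."""
    PROBLEM = "problem-oriented (customer driven)"
    CREATIVE = "creative"
    SOLUTION = "solution-oriented (developer driven)"


@dataclass(frozen=True)
class PatternKind:
    name: str
    space: Space
    # None marks a slot that is provided but not yet formalized.
    formalization: Optional[str]


# Hypothetical catalogue; the assignment to spaces follows the paper's text.
CATALOGUE = [
    PatternKind("application pattern", Space.PROBLEM, "GDM, CuM, ADM, AM strata"),
    PatternKind("software solution pattern", Space.CREATIVE, None),
    PatternKind("architectural pattern", Space.CREATIVE, "standard architectures"),
    PatternKind("design pattern", Space.SOLUTION, "naming conventions, data and control flow"),
    PatternKind("language pattern", Space.SOLUTION, "coding conventions, source-code constructs"),
]


def open_slots(catalogue):
    """Return the names of pattern kinds that still lack a formalization."""
    return [kind.name for kind in catalogue if kind.formalization is None]


def kinds_in(catalogue, space):
    """Return the names of pattern kinds attributed to one area of influence."""
    return [kind.name for kind in catalogue if kind.space is space]


print(open_slots(CATALOGUE))
print(kinds_in(CATALOGUE, Space.CREATIVE))
```

Keeping the unformalized kind in the catalogue with an explicit empty value, rather than omitting it, mirrors the idea of providing a slot that later research can fill.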

3.3.3 Architectural patterns

While the software development paradigms concern process-related information relevant for software development, architectural concerns directly determine the solution. Examples of such architectural concerns are: basic communication concepts (producer/consumer, remote procedure calls, client/server, etc.), domain-independent standard architectures (CORBA, DCE, ANSA, ODP, etc.), and domain-dependent standard architectures (e.g. insurance application architecture, IAA).

In this section we refer to the different notions associated with monolithic solutions as opposed to client-server solutions, as well as to differences due to the varying amount of ready-made software (reusable components, application packages, complete applications, application frameworks, etc.) integrated in the system. Quite often, the integration of such systems requires that the basic data structures and the basic notion of interfaces be adapted to the foreign software included. This might reach a level where the architecture of the framework used is carried over to the newly developed system. Such standard architectures are relatively easy to formalize and may constitute a clue in the reverse engineering process.

3.3.4 Design patterns

With design patterns, we arrive at a lower level of abstraction, but still at concerns above the code level. We attribute these considerations to the developer organization. Therefore, one might assume that they are relatively easy to formalize and to identify. However, one has to recognize that successive maintenance operations might lead to quite a bit of clutter in this respect. Shifts of style, usage of different tools, and other extraneous aspects which might change over the lifetime of a system might yield mixtures of style which are quite hard to identify. Hence, at present, we strive for the following kinds of design patterns, which can be identified with tool support:

• data definition, data model (tables, files, etc.)
• definition of control flow (decomposition)
• internal communication patterns (data flow, blackboard, etc.)
• definition of modules/objects and their interrelationships (decomposition structure)
• human/computer interaction (HCI) definition
• task management
• naming conventions

Amongst them, naming conventions are certainly the easiest to track down and to use within a given project as well as across projects. Thus, the naming pattern part is a learning part of the system, which is initialized with the developers’ naming conventions and has to “learn” during the reverse engineering process how these naming conventions are applied by the developer(s) within the given development or maintenance task on the software currently being reverse engineered.

These low-level considerations are extended to general patterns for data definition, item initialization, or the data model per se. Here, developer characteristics and tool characteristics start to overlap. E.g. the specific form of module decomposition, of task management, and of data and control flow depends not only on the methodology followed but also on the specific tool-set used by the developer’s organization.

On the basis of naming conventions, language patterns (discussed below), usage and communication patterns, and certain tool- or methodology-specific stereotypes, generic high-level patterns can be defined. These make it possible to formulate hypotheses against which the actual code can be checked, thus supporting program understanding. The HCI item in the list above constitutes a special entry in this respect. Depending on the tools used and on the neatness and uniformity of the man/machine interfaces provided, either very powerful hypotheses (patterns) can be formulated against which the actual code is to be checked, or little can be achieved.

3.3.5 Language patterns

With language patterns we refer to those patterns which are due to typical C-style, Pascal-style, COBOL-style, or LISP-style, to mention just a few non-object-oriented extremes. They are due to the different possibilities offered by different programming languages and to the differences in style which have emerged as a result. To highlight this notion, we mention just a few such language patterns: coding conventions, data structures (instantiations, initializations), source-code constructs (loops, etc.), or basic algorithms (sorting, searching, reversing, etc.).

Coding conventions are probably those aspects which would immediately come to mind. Some of them can be formulated as patterns. The situation is similar to naming conventions, since they constitute (on the level of the actual programming language used) patterns which are language-specific (i.e. general), organization-specific, and programmer-specific. Hence, the set of predefined patterns has to be augmented in a learning process. More than with naming conventions, however, one should be aware that using them means entering relatively shaky ground: assuming long-lived systems, generations of programmers might have used different patterns, the later ones violating the earlier ones, or ad-hoc maintenance might have ignored them altogether. Hence, coding conventions should always be used rather cautiously to facilitate reverse engineering.

With instantiations and initializations we are addressing, on the one hand, the issue of whether the programming language provides certain defaults or not and, on the other hand, the related question of when and where such initializations (or instantiations) are usually carried out.

With source-code constructs, we are referring to certain recurring structures which take a concrete syntactic form within a given programming language. Examples are the basic pattern of alternatives and recursion in LISP, or the specific nature of WHILE constructs in Algol-like languages, where from the WHILE-condition one can infer which variables drive the loop and thus consider the slice which sets and resets these variables.

The item on “basic algorithms” is a higher-level concern to be addressed within a given language framework. We refer here to those aspects which might, if the problem warrants it, be relegated to a specific library routine, but which for whatever reason are directly implemented in newly written code. We think of them as rather powerful patterns [14] which emerge recurrently in specific situations and which might warrant that specific stereotypical patterns are checked against those portions of code where one can see a chance of identifying them. We consider them difficult to identify, though, because in most situations one has to expect that such patterns would not appear in a clean and straightforward form, but rather interwoven with other code (e.g. manipulation of complex data structures, intermediate computations, etc.).

3.4 Interacting Patterns

Above, we identified a set of patterns which result from the specific conditions under which software development takes place. If we consider software development as a problem-solving activity which is conducted on the basis of the developer’s education and experience and the customer’s domain and interests, we could go to the extreme of viewing code as a complex subject resulting from the interference of various problem and solution patterns, expressed on different levels of abstraction (and at different points in time). If one adopts this extreme position, the COREM tool-set would be an aid for the detective who aims at isolating such patterns, thus solving the puzzle.

Unfortunately, this is too simplistic a perspective. Not all of the patterns underlying the process can be well formalized and thus automatically uncovered. But uncovering at least the most frequently recurring ones already greatly reduces the amount of human intervention needed during reverse engineering. The following section concentrates on these aspects.

4 Improvements of the reverse engineering process

The complex interference of problem and solution patterns in a reverse engineered program demands that the human engineer contribute his domain knowledge in identifying such recurring patterns and his ability to manage uncertainty in resolving (to some degree) overlapping patterns. But having a given set of patterns on different levels of abstraction at hand facilitates this demanding task. Improvements for the reverse engineering process cover the generation and further use of specific patterns and, thus, a reduction of the human intervention in the reverse engineering process.

Next we discuss the improvements of pattern-driven reverse engineering in turn. For this, we reconsider the clustering of our patterns into those directly derivable from the solution space and those derivable from the creative and the problem space (see Figure 3). Since the definition and use of patterns for reverse (and even forward) engineering seem obvious, we concentrate on describing specific situations where their pay-off becomes especially evident.

4.1 Improvements from solution-space patterns

Design patterns support deriving information that is usually buried somewhere in the source code, such as data/control flow or human/computer interaction. Deriving objects that are not permanently stored and can only be discovered via the input and output at the user interface can benefit significantly from such patterns. The identification process can be sped up, since general communication patterns and user interface items are at hand to be matched against “suspicious” source code segments.

For language patterns we identify two situations in particular where their use might be helpful:

• controlling the search strategy when applying hard and fast patterns such as the language/data structure patterns mentioned above, patterns due to using special tools or application generators, or the application patterns discussed below. While this implies early application, using an incorrect strategy will not yield incorrect results but just an inefficient heuristic.

• resolving clutters. These are situations where intertwining between control and data structure lets plain methods fail. In these situations, patterns describing conventions (application-, company-, or domain-specific stylistic standard situations) will be applied only after the syntax-oriented patterns. Furthermore, such results will be marked as intermediate suggestions, as we did on other occasions during our reverse engineering process where intermediate solutions had to be pinned down, but only a later step could identify the correct option (see also [11]).
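The two-stage use of language patterns described in Section 4.1 (syntax-oriented patterns first, convention patterns afterwards and marked as intermediate suggestions) could be sketched as follows. This is a minimal illustration under loud assumptions, not the COREM tool-set: Python and its standard `ast` module stand in for a parser of the procedural legacy language, the WHILE-driver check is a simplification of the WHILE-condition heuristic from Section 3.3.5, and the `tbl_` naming convention as well as the names `Finding`, `while_driver_findings`, `naming_findings`, and `analyse` are hypothetical.

```python
import ast
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    pattern: str     # which pattern matched
    subject: str     # the identifier it matched on
    tentative: bool  # True = intermediate suggestion awaiting confirmation


def while_driver_findings(tree: ast.AST) -> list:
    """Syntax-oriented pattern: names tested in a WHILE condition and assigned in its body."""
    findings = []
    for node in ast.walk(tree):
        if isinstance(node, ast.While):
            tested = {n.id for n in ast.walk(node.test) if isinstance(n, ast.Name)}
            assigned = {
                n.id
                for stmt in node.body
                for n in ast.walk(stmt)
                if isinstance(n, ast.Name) and isinstance(n.ctx, ast.Store)
            }
            for name in sorted(tested & assigned):
                findings.append(Finding("while-driver", name, tentative=False))
    return findings


# Hypothetical naming convention: a "tbl_" prefix marks a persistent table.
NAMING_CONVENTIONS = {"table-candidate": re.compile(r"tbl_\w+")}


def naming_findings(tree: ast.AST) -> list:
    """Convention pattern: identifiers that match a known naming convention."""
    names = sorted({n.id for n in ast.walk(tree) if isinstance(n, ast.Name)})
    return [
        Finding(label, name, tentative=True)
        for label, regex in NAMING_CONVENTIONS.items()
        for name in names
        if regex.fullmatch(name)
    ]


def analyse(source: str) -> list:
    tree = ast.parse(source)
    # Syntax-oriented patterns are applied first, convention patterns afterwards.
    return while_driver_findings(tree) + naming_findings(tree)


SAMPLE = """
i = 0
total = 0
while i < len(tbl_invoice):
    total = total + tbl_invoice[i]
    i += 1
"""

for finding in analyse(SAMPLE):
    print(finding)
```

Marking convention-based results as tentative keeps them separate from syntax-derived results, so that a later step (or the human engineer) can confirm or discard them.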

4.2

Improvements from creative- and problem-space patterns

It is to be acknowledged, that there are still quite a number of detailed problems to be resolved in this model driven scavenging approach. But we are certain that the technology we have at hand will finally adequately support it.

Application patterns that constitute recurring patterns of the problem space stratify the customers problem into general domain model (GDM), customer model (CuM), application domain model (ADM), and application model (AM). The aspirations of using such patterns are threefold:1

The various influences of software solution patterns (e.g. application domain, developer and customer organization) make it difficult to formalize such patterns. As mentioned above we provide for this slot, but we refer to further research. In contrast to software solution patterns architectural patterns are relatively easy to formalize. Standard communication architectures like CORBA, DCE, ODP, or ANSAware constitute such collections of related patterns and provide special interfaces for platform independent cooperation of applications. Such architectures represent domain independent communication concepts that in case of re-engineering constitute real-world architectural patterns. In case of extending the life-span of legacy applications they seem beneficiary for modernizing or even completely changing their architecture.

 Instantiable domain model. It is more economical, to encode domain knowledge at the problem space level and instantiate it in a refinement (and even correction) step to the solution level. In this case, a domain model provides a framework which is to be filled in a relatively conventional forward requirements modeling process. The process is speeded up though, since it has a well defined non-empty point of departure and is guided by the structure (not necessarily the comprehensiveness) of the domain model. Instantiations, once made, can be saved, thus allowing future reverse engineering efforts in related applications to build upon the already instantiated portions of the domain model.

5 Conclusion

• Demand-driven instantiation. This would allow for a demand-driven injection of application knowledge. The bottom-up reverse engineering process yields the RooAM as described earlier. Whenever information from the FooAM is needed, though, the bottom-up process does not assume the complete existence of a FooAM, but rather consults the domain model and then, possibly benefiting from the application patterns discussed above, asks only for the information needed specifically to map the RooAM to the FooAM.

The weakness of conventional reverse engineering approaches in supporting the identification of recurring patterns led us to investigate application-specific knowledge for automated use in reverse engineering tasks. In this paper, we identified different kinds of patterns on the basis of our object-oriented reverse engineering method COREM and several essential factors influencing software development. We discussed how design recovery patterns can improve reverse engineering results in terms of our method and where the problems and limits in examining long-lived software systems (mainly caused by unstructured maintenance operations) are to be found. Currently, we continue extending our COREM tool-set to integrate the patterns discussed. This helps reduce the human intervention usually necessary during reverse engineering. Future research will concentrate on further formalizing these patterns, especially where, at present, only slots can be provided. Together with the COREM tool-set, the empirical basis for our patterns will also be provided.

• Model-driven scavenging. Here, we explicitly change aim and perspective. While the previous discussion stipulated a complete transformation from a procedural to an object-oriented architecture, we now consider situations where the quest for some reusable (object) component should be satisfied from the code of an otherwise untouched procedural system. The entry point to such a search and reverse engineering process would be a search pattern on the level of abstraction of a FooAM object. Relating it via the domain model to application patterns will help to identify those portions of a procedural system which are needed to build such an object, provided these portions are at hand. In effect, the automated steps are performed as shown, but the knowledge-intensive portions are confined to those aspects of the application where they will pay off.
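The interplay of demand-driven instantiation and model-driven scavenging can be illustrated with a minimal Python sketch. It is not part of COREM: the class `DomainModel`, the functions `scavenge` and `expert`, and the sample procedure names are all hypothetical, and simple keyword matching on identifiers merely stands in for real application patterns. The sketch only shows the control flow: a concept is looked up in the domain model, the human expert is consulted only if the model has no entry yet, and the saved answer is reused in later searches.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Set


@dataclass
class DomainModel:
    """Hypothetical domain model: maps a concept name to keywords that
    stand in for the application patterns indicating that concept."""
    patterns: Dict[str, Set[str]] = field(default_factory=dict)

    def lookup(self, concept: str,
               ask_expert: Callable[[str], Set[str]]) -> Set[str]:
        # Demand-driven instantiation: ask the expert only when the model
        # has no entry for this concept, then save the answer for reuse.
        if concept not in self.patterns:
            self.patterns[concept] = {k.lower() for k in ask_expert(concept)}
        return self.patterns[concept]


def scavenge(concept: str,
             procedures: Dict[str, List[str]],
             model: DomainModel,
             ask_expert: Callable[[str], Set[str]]) -> List[str]:
    """Return the names of procedures whose identifiers match at least one
    keyword associated with the requested concept (assumed heuristic)."""
    keywords = model.lookup(concept, ask_expert)
    hits = []
    for name, identifiers in procedures.items():
        if keywords & {ident.lower() for ident in identifiers}:
            hits.append(name)
    return sorted(hits)


# Hypothetical procedural system: procedure name -> identifiers it uses.
procedures = {
    "upd_acct": ["balance", "acct_no"],
    "prt_rpt": ["page", "line"],
    "chk_lim": ["Balance", "limit"],
}

asked: List[str] = []  # records how often the expert is consulted


def expert(concept: str) -> Set[str]:
    asked.append(concept)
    return {"balance", "acct_no"}


model = DomainModel()
first = scavenge("Account", procedures, model, expert)
second = scavenge("Account", procedures, model, expert)
```

In this sketch the second search reuses the saved instantiation, so the expert is consulted only once; the rest of the procedural system remains untouched.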

References

[1] T.J. Biggerstaff. Design recovery for maintenance and reuse. IEEE Computer, 22(7):36–49, July 1989.
[2] T.J. Biggerstaff, B.G. Mitbander, and D. Webster. The concept assignment problem in program understanding. In Proceedings of the 15th International Conference on Software Engineering, Baltimore, Maryland, pages 482–498. IEEE Computer Society Press, May 1993.

¹ In the following discussion we use “domain model” rather than referring to each of the models (GDM, CuM, ADM, and AM) separately.


[3] T.J. Biggerstaff, B.G. Mitbander, and D.E. Webster. Program understanding and the concept assignment problem. Communications of the ACM, 37(5):72–83, May 1994.
[4] B.K. Das. A knowledge-based approach to the analysis of code and program design language (PDL). IEEE Conference on Software Maintenance, pages 290–296, October 1989.
[5] K. Fukunaga. Prompter: A knowledge based support tool for code understanding. In Proceedings of the 8th International Conference on Software Engineering, pages 358–363. IEEE Computer Society Press, August 1985.
[6] H. Gall and R. Klösch. Program Transformation to enhance the Reuse Potential of Procedural Software. In ACM Symposium on Applied Computing, SAC ’94, pages 99–104. Phoenix, Arizona, ACM Press, March 1994.
[7] E. Gamma, R. Helm, R. Johnson, and J. Vlissides. Design Patterns: Elements of Reusable Object-Oriented Software. Addison-Wesley, 1995.
[8] M.T. Harandi and J.Q. Ning. Knowledge-based program analysis. IEEE Software, 7(1):74–81, January 1990.
[9] W. Johnson and E. Soloway. Proust: Knowledge-based program understanding. IEEE Transactions on Software Engineering, 11(3), March 1985.
[10] L. Markosian, P. Newcomb, R. Brand, S. Burson, and T. Kitzmiller. Using an enabling technology to reengineer legacy systems. Communications of the ACM, 37(5):58–71, May 1994.
[11] R.T. Mittermeir, R.R. Klösch, and H.C. Gall. Object Recovery from Procedural Systems for Changing the Architecture of Applications. In Proceedings of the 3rd International Conference on Systems Integration (ICSI ’94), Sao Paulo, Brazil. IEEE Computer Society Press, August 1994.
[12] J.Q. Ning, A. Engberts, and W.V. Kozaczynski. Automated support for legacy code understanding. Communications of the ACM, 37(5):50–57, May 1994.
[13] S. Paul and A. Prakash. A framework for source code search using program patterns. IEEE Transactions on Software Engineering, 20(6):463–475, June 1994.
[14] Ch. Rich and L.M. Wills. Recognizing a program’s design: A graph-parsing approach. IEEE Software, 7(1):82–89, January 1990.
[15] H. Ritsch. Heuristische Wiederaufbereitung: Ansatz zur konzeptorientierten Redokumentation von gewachsenen Applikationen. PhD thesis, Institut für Informatik, Universität Klagenfurt, Austria, 1994.
[16] R.C. Waters. The programmer’s apprentice: Knowledge based program editing. IEEE Transactions on Software Engineering, 8, 1982.
