AN AUTOMATED REFACTORING APPROACH TO DESIGN PATTERN-BASED PROGRAM TRANSFORMATIONS IN JAVA PROGRAMS

Sang-Uk Jeon, Joon-Sang Lee, and Doo-Hwan Bae
Division of Computer Science
Department of Electrical Engineering and Computer Science
Korea Advanced Institute of Science and Technology
fsujeon, joon, [email protected]

ABSTRACT

Software often needs to be modified to accommodate requirements changes throughout the software life cycle. To deal with accidental requirements changes during software maintenance, a systematic and safe approach to modifying software is needed. Design patterns provide a high degree of design flexibility for such accidental requirements changes. In this paper, we propose an automated approach to refactoring based on design patterns in Java programs. For a particular design pattern, we define an inference rule that automatically identifies a set of candidate spots and a refactoring strategy that transforms one of the candidate spots into the desired design pattern structure. A candidate spot may be a class or a set of classes to which the design pattern can be applied. We believe that our approach helps maintainers in two ways: much of the manual analysis of source code can be avoided, and the automated program transformation preserves the behavior of the original program by means of the refactoring technique.

1. INTRODUCTION

Throughout the software life cycle, most software systems need to be modified to accommodate requirements changes, or their program modules must be adapted so that they can be reused in other application contexts. Very often, this kind of maintenance activity is very costly, because program modules that were expensively certified through rigorous verification and testing must be certified again after each change, and such activities continue until the system is retired. To cope with these maintenance problems, we need a systematic and safe approach to modifying a software system, whether it is a running system or a legacy one. Refactoring [1] is a safe program restructuring technique that guarantees that the restructured program preserves the
behavior of the original program. Various refactoring techniques have been proposed to modify software structure that can be represented in a UML class diagram, and also to transform programs automatically based on quantitative metrics or static analysis of the source code. Typically, a refactoring technique provides a set of primitive refactoring operations and allows them to be composed, sequentially and in combination, into a high-level refactoring rule that addresses software quality issues such as higher modularity, better performance, and lower code redundancy.

Design patterns [3] are a collection of recurring patterns of relationships among classes, objects, methods, and so on. Each design pattern describes its own application area, its structural and behavioral pattern, and its rationale. The rationale and application area can therefore serve as a good goal for high-level refactoring. Design patterns are widely used because they provide design-level reusability, understandability, and flexibility. Flexibility in particular deserves attention, because it makes the program more open-ended and less sensitive to requirements changes.

Many prior approaches to the automated introduction of design patterns have focused only on how to transform a given software system into a new one that has a desired design pattern, by applying a sequence of primitive refactoring operations defined for class-structure transformation [7, 9, 11]. In other words, there has been no approach that automatically recognizes the candidate spots in the source code where a design pattern-based refactoring could be applied. In those approaches, the designer first inspects the target source code thoroughly and decides which design pattern could be applied to which spot; a tool then performs the program transformation automatically. The tool helps only with the tedious modifications.
If the source code is very large or, even worse, there is no related documentation, it becomes extremely difficult to inspect all of the source code and understand it in detail. Therefore, an automated and systematic way of identifying candidate spots to which a certain design pattern transformation can be applied is strongly needed.

In this paper, we propose an automated approach to identifying the candidate spots, to which the defined refactoring strategy is then applied to transform the program into a particular design pattern. To do this, we elaborate a set of primitives in the form of logical predicates, which extract from the program source code enough structural and behavioral design information to describe the inference rule for each typical design pattern. Even though it is hard to prove that the automated identification of candidate spots is fully reliable, we believe that our approach can help the designer maintain the software system at a lower cost and in a safer way.

The remainder of this paper is organized as follows. In Section 2, we present the assumption, overview, and key techniques of our approach. Section 3 presents the Abstract Factory design pattern as an application example. In Section 4, we address several issues in implementing a prototype tool that supports our approach. In Section 5, we discuss prior work related to our approach, and in Section 6 we conclude with some future work.

2. APPROACH

2.1. Assumption

Notably, our approach refers to the modification history of a target program to obtain data on how the program has evolved. At the maintenance stage, developers or maintainers often extend or refine several classes in the previous source code to meet changing requirements. Design patterns are especially useful when the design spot to which they are applied tends to evolve frequently. To get information about the modified spots, we compare two or more versions of the program. Although there could be more than two versions of the program in the course of maintenance, for simplicity we assume that there are only two versions.
2.2. Overview

Figure 1 shows the overview of our approach. Our approach consists of two parts: an inferencing part for identifying a set of candidate design spots, and a transformation part for restructuring a chosen candidate design spot according to a refactoring strategy.

Figure 1: The overview of our approach. (Inferencing: a set of related programs is consolidated into a design model represented as predicates (PROLOG facts); the design pattern inference rules (PROLOG rules) and a PROLOG query yield the identified candidate spots. Transforming: a spot is chosen by an expert's decision or automatically by the degree of similarity, the refactoring strategy drives the automated transformation into refactored source code, and after confirmation by human experts or quantitative quality metrics the new version of the program is reflected to the modification history.)

The input of the inferencing part is a set of historically related Java programs. By consolidating the programs, the design model is extracted and represented by Prolog-like predicates. The syntax and semantics of the predicates are much like those of Prolog. The following subsection describes the predicates in detail. A design model represented as predicates is converted to a set of PROLOG facts, and the design pattern inference rules are converted to PROLOG rules. Then, by a PROLOG query, the candidate spots are identified, that is, the code that can be transformed into the designated design patterns. The details of the inference rules are described in Section 2.4. After the inferencing part is completed, the designer examines whether it is possible and reasonable to transform each of the candidate spots into the design pattern.

In the transformation part, the examined spots are transformed using a set of refactoring operations according to a refactoring strategy, and a new version of the program is released. The refactoring strategy is an algorithm that describes how to transform a given candidate design spot into the corresponding design pattern. Typically, it results in a sequence of refactoring operations.

Prior work on program modification with design patterns has addressed only the transformation technique itself. A further research opportunity remains: the automated identification of appropriate spots in the program to which a reliable program transformation could be applied. To this end, we propose a systematic way to deductively represent and infer design pattern-based program transformations.
2.3. Formal definitions of the predicates

To reason about the possibility of an intended design pattern transformation, we need design-level information such as the class and interface hierarchies, the attributes and methods of classes, and relatively detailed statements about method calls. The prototype tool reads two versions of a program and extracts a set of design information in the form of logic predicates. In this section, we present the formal descriptions of the logic predicates.

An object-oriented program $P$ is represented as an 8-tuple $(C, I, M, R, L, V, \sigma, \mu)$, where:

$C$ is the set of classes.

$I$ is the set of interfaces.

$M \subseteq C_o \times TEXT \times C_r \times C_{p_1} \times \cdots \times C_{p_i} \times \cdots \times C_{p_n}$ is the relation relating an owner class $C_o$, a method name $TEXT$, a return type $C_r$, and a list of parameter types $C_{p_1}, \ldots, C_{p_n}$.

$R \subseteq (C \times C) \cup (C \times M) \cup (M \times C) \cup (M \times M) \cup (I \times I) \cup (C \times I) \cup (M \times C \times C) \cup (M \times C \times C \times TEXT)$ is the relation representing the relationships between classes, interfaces, and methods, such as the association, aggregation, inheritance, and implements relations, and the relationships between classes and methods, such as method call, object creation, and return-value or parameter-passing dependency.

$L : C \cup I \rightarrow TEXT$ is the function mapping a class or an interface to its name, where $TEXT$ is the set of strings.

$V : R \rightarrow VISIBILITY = \{private, protected, public\}$ is the function mapping a relation in $R$ to an element of $VISIBILITY$.

$\sigma : M \rightarrow C_o \times TEXT \times C_r \times C_{p_1} \times \cdots \times C_{p_n}$ is the function mapping a method to its owner class $C_o$, the name of the method $TEXT$, the return type $C_r$, and the list of parameter types $C_{p_1}, \ldots, C_{p_n}$. For example, $\sigma(m)_o$ denotes the owner class of the method $m$, $\sigma(m)_n$ denotes the name of $m$, and $\sigma(m)_i$ denotes the type of the $i$-th parameter of $m$.

$\mu : C \times C \times TEXT \mapsto MULTIPLICITY = \{\text{zero or one}, \text{one}, \text{one or more}\}$ is the partial function mapping a relation from the class $c_1$ to the class $c_2$ with the reference name $TEXT$ to an element of $MULTIPLICITY$.

We classify the predicates into three types: the predicates that can be directly extracted from a Java program, the predicates that need some analysis and inference, and the history-based predicates that need two or more historically related Java programs. First, we present the predicates that can be directly extracted from a Java program.
inherits(i1:I, i2:I) - If the condition “$(i_1, i_2) \in R$” holds, and the interface i1 inherits from the interface i2 using the extends keyword.

inherits(c1:C, c2:C) - If the condition “$(c_1, c_2) \in R$” holds, and the class c1 inherits from the class c2 using the extends keyword.

implements(c:C, i:I) - If the condition “$(c, i) \in R$” holds, and the class c implements the interface i using the implements keyword.

creates(m:M, c:C) - If the condition “$(m, c) \in R$” holds, and there exists some statement within the method m which creates an object of the class c using the new keyword.

creates(c1:C, c2:C) - If the condition “$(c_1, c_2) \in R \wedge \exists m \in M\,(\sigma(m)_o = c_1 \wedge \mathit{creates}(m, c_2))$” holds.

returns(m:M, c:C) - If the condition “$(m, c) \in R$” holds, and there exists some statement within the method m which returns an object of the class c using the return keyword.

returns(c1:C, c2:C) - If the condition “$(c_1, c_2) \in R \wedge \exists m \in M\,(\sigma(m)_o = c_1 \wedge \mathit{returns}(m, c_2))$” holds.

make_assoc(m:M, c1:C, c2:C, t:TEXT) - If the condition “$(m, c_1, c_2, t) \in R$” holds, and there exists some statement in the method m that sets the attribute t of c1 to an object of c2.

make_aggre(m:M, c1:C, c2:C) - If the condition “$(m, c_1, c_2) \in R \wedge \exists m_1 \in M\,(\mathit{calls}(m, m_1) \wedge \sigma(m_1)_o = c_1 \wedge \exists i \le n\,(\sigma(m_1)_i = c_2))$” holds, where n is the number of parameters of the method m1.

references(m:M, c:C) - If the condition “$(m, c) \in R$” holds, and there exists some statement within the method m that declares an attribute whose type is the class c.

references(c1:C, c2:C) - If the condition “$(c_1, c_2) \in R$” holds, and there exists some statement within the class c1 that declares an attribute whose type is the class c2, or the condition “$\exists m \in M\,(\sigma(m)_o = c_1 \wedge \mathit{returns}(m, c_2))$” holds.

calls(m1:M, m2:M) - If the condition “$(m_1, m_2) \in R$” holds, and there exists some statement within the method m1 that calls the method m2.

calls(c:C, m:M) - If the condition “$(c, m) \in R \wedge \exists m' \in M\,(\sigma(m')_o = c \wedge \mathit{calls}(m', m))$” holds.

calls(m:M, c:C) - If the condition “$(m, c) \in R \wedge \exists m' \in M\,(\sigma(m')_o = c \wedge \mathit{calls}(m, m'))$” holds.

calls(c1:C, c2:C) - If the condition “$(c_1, c_2) \in R \wedge \exists m_1, m_2 \in M\,(\sigma(m_1)_o = c_1 \wedge \sigma(m_2)_o = c_2 \wedge \mathit{calls}(m_1, m_2))$” holds.
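As a rough illustration of how the class-level predicates above are lifted from the method-level ones, the following Java sketch stores creates(m, c) facts and answers creates(c1, c2) by searching for a method owned by c1. The names FactBase, MethodRef, CreatesFact, and classCreates, as well as the sample fact about App.main, are assumptions made for this illustration only; they do not describe the data model of the prototype tool.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical fact store for illustration; not the prototype tool's data model.
final class FactBase {

    // A method m, reduced to its owner class and its name.
    static final class MethodRef {
        final String owner;
        final String name;

        MethodRef(String owner, String name) {
            this.owner = owner;
            this.name = name;
        }
    }

    // One extracted fact creates(m, c): method m contains "new c(...)".
    static final class CreatesFact {
        final MethodRef method;
        final String createdClass;

        CreatesFact(MethodRef method, String createdClass) {
            this.method = method;
            this.createdClass = createdClass;
        }
    }

    private final List<CreatesFact> createsFacts = new ArrayList<>();

    void addCreates(MethodRef method, String createdClass) {
        createsFacts.add(new CreatesFact(method, createdClass));
    }

    // creates(c1, c2): some method owned by c1 creates an object of c2.
    boolean classCreates(String c1, String c2) {
        for (CreatesFact fact : createsFacts) {
            if (fact.method.owner.equals(c1) && fact.createdClass.equals(c2)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        FactBase facts = new FactBase();
        // Assumed sample fact: App.main contains "new PreludeCar()".
        facts.addCreates(new MethodRef("App", "main"), "PreludeCar");
        System.out.println(facts.classCreates("App", "PreludeCar")); // true
        System.out.println(facts.classCreates("PreludeCar", "App")); // false
    }
}
```

The same lifting pattern (an existential search over the methods owned by a class) would apply to the class-level returns and calls predicates.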
There is a set of predicates that cannot be extracted directly but can be inferred through heuristics and static analysis of the source code.

associates(c1:C, c2:C) - If the condition “$\mathit{calls}(c_1, c_2) \wedge \mathit{references}(c_1, c_2)$” holds, then there is an association relationship from the class c1 to the class c2.

aggregates_n(c1:C, c2:C) - If the condition “$\mathit{associates}(c_1, c_2) \wedge \mathit{creates}(c_1, c_2)$” holds, then there is a normal aggregation relationship from the class c1 to the class c2.

aggregates_c(c1:C, c2:C) - If the condition “$\mathit{aggregates\_n}(c_1, c_2) \wedge V(c_1, c_2) = private \wedge \exists t \in TEXT\,(\mu(c_2, c_1, t) = \text{one})$” holds, then there is a composite aggregation relationship from the class c1 to the class c2.

delegates(m1:M, m2:M) - If the condition “$(m_1, m_2) \in R \wedge \mathit{calls}(m_1, m_2) \wedge \sigma(m_1)_o \neq \sigma(m_2)_o \wedge (\neg\exists m_3 \in M\,(\mathit{calls}(m_1, m_3) \wedge \sigma(m_3)_n \neq \sigma(m_2)_n) \vee \forall m \in M\,(\mathit{calls}(m_1, m) \rightarrow \sigma(m_1)_n = \sigma(m)_n))$” holds.

delegates(c1:C, c2:C) - If the condition “$(c_1, c_2) \in R \wedge \exists m_1, m_2 \in M\,(\sigma(m_1)_o = c_1 \wedge \sigma(m_2)_o = c_2 \wedge \mathit{delegates}(m_1, m_2))$” holds.
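The method-level delegates(m1, m2) condition can be sketched in Java as follows. This is a minimal sketch under simplifying assumptions: a method is identified only by its owner class and name (parameter types are ignored), and the caller supplies the list of methods called from the body of m1. The class names Window, Renderer, and Logger in the usage part are hypothetical.

```java
import java.util.List;

// Hypothetical helper for illustration; methods are identified by owner and name only.
final class DelegationHeuristic {

    static final class Method {
        final String owner;
        final String name;

        Method(String owner, String name) {
            this.owner = owner;
            this.name = name;
        }

        boolean sameAs(Method other) {
            return owner.equals(other.owner) && name.equals(other.name);
        }
    }

    // callees: every method called from the body of m1.
    static boolean delegates(Method m1, Method m2, List<Method> callees) {
        boolean callsM2 = false;
        boolean onlyNameOfM2 = true; // no callee has a name different from m2's name
        boolean onlyNameOfM1 = true; // every callee has the same name as m1
        for (Method callee : callees) {
            if (callee.sameAs(m2)) {
                callsM2 = true;
            }
            if (!callee.name.equals(m2.name)) {
                onlyNameOfM2 = false;
            }
            if (!callee.name.equals(m1.name)) {
                onlyNameOfM1 = false;
            }
        }
        boolean differentOwners = !m1.owner.equals(m2.owner);
        return callsM2 && differentOwners && (onlyNameOfM2 || onlyNameOfM1);
    }

    public static void main(String[] args) {
        Method windowDraw = new Method("Window", "draw");
        Method rendererDraw = new Method("Renderer", "draw");
        Method windowClose = new Method("Window", "close");
        Method loggerLog = new Method("Logger", "log");
        Method rendererDispose = new Method("Renderer", "dispose");

        // Window.draw only forwards to Renderer.draw.
        System.out.println(delegates(windowDraw, rendererDraw, List.of(rendererDraw))); // true
        // Window.close calls two differently named methods, so no delegation is assumed.
        System.out.println(delegates(windowClose, rendererDispose,
                List.of(loggerLog, rendererDispose))); // false
    }
}
```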
A delegation is assumed if only one method is called or if the called method has the same name as the calling method. This heuristic is borrowed from [20].

In addition to the predicates described above, we define three history-related predicates. If there are two versions of a program, $P = (C, I, M, R, L, V, \sigma, \mu)$ and $P' = (C', I', M', R', L', V', \sigma', \mu')$, then the predicates are defined as follows:

refines(m1:M, m2:M') - If the condition “$\sigma(m_1)_n = \sigma'(m_2)_n \wedge \exists c_1 \in C, \exists c_2 \in C'\,(\sigma(m_1)_o = c_1 \wedge \sigma'(m_2)_o = c_2 \wedge L(c_1) = L'(c_2))$” holds, where the bodies of m1 and m2 need not be identical.

refines(c1:C, c2:C') - If the condition “$L(c_1) = L'(c_2)$” holds, where the bodies of c1 and c2 need not be identical.

isomorphic(C1:$2^C$, C2:$2^C$) - If the condition “$\forall x \in C_1\,(\exists y \in C_2\,(x = y \vee \mathit{refines}(x, y)) \wedge (\neg\exists x' \in C_1\,((x, x') \in R) \vee \exists y, y' \in C_2\,((x = y \vee \mathit{refines}(x, y)) \wedge (x' = y' \vee \mathit{refines}(x', y')) \vee (y, y') \in R)))$” holds.

The refines(c, c') predicate represents the following case: there are two versions of a program, P and P', where P' is the later version, and the class name of c exists in both P and P'. Their bodies need not be identical: c' in P' is a refined version of c in P. Whether c' in P' is derived from c in P is decided by their names. Two classes with the same name could in fact be completely different from each other, or the maintainer could have intentionally changed the name of a class, but those problems can be handled easily by a renaming map provided by the maintainer. The isomorphic(C1, C2) predicate means that the two sets of classes are isomorphic in terms of topology: each element of C2 is equal to or derived from some element of C1, and if there is a relationship between elements in C1, there must be a corresponding relationship between the equal or derived elements in C2.

2.4. Inference rule and refactoring strategy

Figure 2 shows the schematic structure for reasoning out the design pattern candidates.
The design models extracted from two versions of a program each exhibit a set of structural and behavioral properties. Each rounded box in the inference rule container represents an inference rule for a particular design pattern. An inference rule is defined as a logical implication with role components as its input parameters. The reasoning structure of design pattern recognition is constructed as follows:

$P \rightarrow \mathit{A\_Design\_Pattern}(c_1, c_2, \ldots, c_n)$

Figure 2: The recognition of design pattern candidates. (The structural and behavioral properties (axioms) of the extracted design model are checked against the implications p1 -> q1, p2 -> q2, and p3 -> q3 held in the inference rule container.)

The premise P of the above implication is a logical expression consisting of the predicates provided in this paper and the connectives provided by Prolog. If the implication is proven true, then the design spots c1, c2, ..., cn in the program are mapped to the role components of the design pattern. A role component is a class that plays an important role in a particular design pattern.

In the inferencing part, given the structural and behavioral properties of an extracted design model, the tool delegates to the Prolog query processor the task of checking whether a spot of design properties implies the premise of some inference rule. If such a pair (design properties, premise) exists, then the set of classes related to the matched design properties becomes the candidate design spot. A candidate spot consists of a set of classes, each of which may correspond to a role component of the design pattern. Given the candidate spot, the program is transformed through the sequence of refactoring operations that results from the refactoring strategy.

We develop a pair (inference rule, refactoring strategy) for each design pattern. The refactoring strategy is an algorithmic description of how to transform the chosen candidate spot into the design pattern. We use the set of refactoring operations proposed by Cinneide and Nixon [9].
Figure 3: The typical structure of Abstract Factory design pattern. (Participants: Client; AbstractFactory with CreateProductA() and CreateProductB(); ConcreteFactory1 and ConcreteFactory2, each with CreateProductA() and CreateProductB(); AbstractProductA with ProductA1 and ProductA2; AbstractProductB with ProductB1 and ProductB2.)

Figure 4: Grouping the product classes into the transitively-closed sets. (A Client and the classes C1 to C8. Legend: closed set, aggregation, creation dependency, association.)
3. APPLYING TO THE ABSTRACT FACTORY DESIGN PATTERN
3.1. Inference rule for identifying the concrete factory

Usually, the set of inference rules is developed based on the heuristics and knowledge of human experts. Currently, we are developing the inference rules for a few design patterns. In this section, we do not describe all of them; we present only an inference rule for the Abstract Factory design pattern as an example.

Figure 3 shows the typical structure of the Abstract Factory design pattern [3]. Normally, a single instance of some ConcreteFactory class is created by the client at run time. The concrete factory creates, composes, and represents a family of products. For a different product family, the client should use a different concrete factory. The Abstract Factory design pattern has the following benefits.

Because a factory encapsulates the responsibility and the process of creating a family of products, it separates the client from the detailed implementation of the products.

When the product objects in a family are designed to work together, the Abstract Factory design pattern can enforce this constraint.

The Abstract Factory design pattern is widely used when a program structure should be configured with one of multiple families of products, or when a family of products is designed to be used together. The goal of finding candidate spots for the Abstract Factory design pattern is to identify the set of family products. First, we consider how to find a candidate spot for a concrete factory. Among the product classes created by a client class, we regard each transitive closure of the relationships among those classes as a set of family products in the Abstract Factory design pattern.
Figure 5: Identifying concrete factory classes. (A Client, a ConcreteFactory with createC1(), createC2(), and createC3(), and the product classes C1, C2, and C3.)
For example, suppose that a client class c can create and manage the objects of a set of product classes {c1, c2, ..., cn}. Several association or aggregation links could be established among those product objects by the client class c. Let us view each product class as a node and each relationship as an edge in an undirected graph. In such a graph, we can group the product classes into transitively-closed sets. For example, if a client class creates 8 classes as shown in Figure 4, we get 3 closed sets. A transitively-closed set of product classes represents a product family that would be created and managed together by the client class c, so it can be identified, together with the client class c, as a candidate spot for the concrete factory, as shown in Figure 5. For a program $P = (C, I, M, R, L, V, \sigma, \mu)$, the inference rule for identifying the concrete factory can be represented formally as follows.
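$$\exists c \in C,\ PC \subseteq C\ \Bigl(\forall x \in PC\ \bigl(\mathit{creates}(c, x) \wedge \exists t \in TEXT,\ m \in M,\ y \in PC\ (x \neq y \wedge \sigma(m)_o = c \wedge (\mathit{make\_assoc}(m, x, y, t) \vee \mathit{make\_assoc}(m, y, x, t) \vee \mathit{make\_aggre}(m, x, y) \vee \mathit{make\_aggre}(m, y, x)))\bigr)\Bigr) \rightarrow \mathit{CandidateConcreteFactory}(c, PC)$$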
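The grouping of product classes into transitively-closed sets amounts to computing the connected components of an undirected graph. The following Java sketch shows one way to do this with a breadth-first search. It assumes that the product classes created by the client and the association or aggregation links set up by the client's methods have already been extracted; the class name ProductFamilies, the method group, and the sample classes and links are hypothetical. Note that a class without any link ends up in a group of its own, which would not satisfy the premise of the rule, since the rule requires every class in PC to be linked to another class in PC.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper for illustration; not part of the prototype tool.
final class ProductFamilies {

    // products: classes x with creates(c, x) for the client c.
    // links: pairs {x, y} taken from make_assoc / make_aggre facts of c's methods.
    static List<Set<String>> group(List<String> products, List<String[]> links) {
        // Build an undirected adjacency list over the product classes.
        Map<String, List<String>> neighbours = new LinkedHashMap<>();
        for (String product : products) {
            neighbours.put(product, new ArrayList<>());
        }
        for (String[] link : links) {
            // Links to classes that the client does not create are ignored.
            if (neighbours.containsKey(link[0]) && neighbours.containsKey(link[1])) {
                neighbours.get(link[0]).add(link[1]);
                neighbours.get(link[1]).add(link[0]);
            }
        }

        // Breadth-first search from every class that has not been visited yet.
        List<Set<String>> families = new ArrayList<>();
        Set<String> seen = new HashSet<>();
        for (String start : products) {
            if (!seen.add(start)) {
                continue;
            }
            Set<String> family = new TreeSet<>();
            Deque<String> queue = new ArrayDeque<>();
            queue.add(start);
            while (!queue.isEmpty()) {
                String current = queue.remove();
                family.add(current);
                for (String next : neighbours.get(current)) {
                    if (seen.add(next)) {
                        queue.add(next);
                    }
                }
            }
            families.add(family);
        }
        return families;
    }

    public static void main(String[] args) {
        List<String> products = List.of("C1", "C2", "C3", "C4", "C5");
        List<String[]> links = List.of(
                new String[] {"C1", "C2"},
                new String[] {"C2", "C3"},
                new String[] {"C4", "C5"});
        System.out.println(group(products, links)); // [[C1, C2, C3], [C4, C5]]
    }
}
```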
The CandidateConcreteFactory(c, PC) is a high-level predicate for identifying a candidate spot for the concrete factory class, where the class c is a client class that uses the factory, and the set of classes PC is the set of family products in a concrete factory.

3.2. Inference rule for considering the abstract factory

In Section 3.1, we identified the set of classes that could be created in a concrete factory. Now, we present how to identify the abstract factory for each concrete factory. In this step, our approach uses the modification history. Suppose that in a program $P_1$ a candidate spot for the concrete factory is identified whose role components are the client c and a family of products c1, ..., cn. Similarly, in a program $P_2$ that is a refined version of the program $P_1$, a candidate spot for the concrete factory is identified whose role components are the client c' and the family of products c1', ..., cn'. If every product class in c1, ..., cn has a corresponding one in c1', ..., cn', the two families of product classes possibly have the same abstract factory class as their superclass. Such an abstract factory class declares the common operations to be implemented in its concrete factory classes. If there is a program $P' = (C', I', M', R', L', V', \sigma', \mu')$ that is refined from the program $P$, the inference rule for the Abstract Factory design pattern is formalized as follows:

$$\exists c \in C,\ PC_1 \subseteq C,\ c' \in C',\ PC_2 \subseteq C'\ \bigl(\mathit{refines}(c, c') \wedge \mathit{isomorphic}(PC_1, PC_2) \wedge \mathit{CandidateConcreteFactory}(c, PC_1) \wedge \mathit{CandidateConcreteFactory}(c', PC_2)\bigr) \rightarrow \mathit{CandidateAbstractFactory}(c, c', PC_1, PC_2)$$

The CandidateAbstractFactory(c, c', PC1, PC2) is a high-level predicate for identifying a candidate spot for the Abstract Factory design pattern: if the set of product classes PC1 for a client class c and the set of product classes PC2 for a client class c' are isomorphic (see footnote 1), they can be constructed as sibling concrete factory classes inheriting from the same abstract factory class. The heuristic of the inference rule for the Abstract Factory design pattern is shown graphically in Figure 6.

Footnote 1: Two class networks are isomorphic if and only if they have the same number of classes and the same relationships. This can be expressed by the isomorphic predicate defined in Section 2.3.

Figure 6: The identification of the abstract factory. (Prior version: Client, ConcreteFactory with createC1(), createC2(), and createC3(), and the products C1, C2, and C3. Posterior version: Client', ConcreteFactory' with createC1(), createC2(), and createC3(), and the products C1, C2, and C3'. Result: AbstractFactory with createC1(), createC2(), and createC3(), and the abstract products AbstractC1, AbstractC2, and AbstractC3.)

3.3. Refactoring strategy

In this section, we briefly describe how to transform a candidate spot into the Abstract Factory design pattern. Given a client class c and two product families identified as the candidate spots for the concrete factory classes, we must create an abstract factory class for each product family via a Java interface. Then, each of the concrete factory classes realizes its own abstract factory class. After that, within the client class c, all of the Java statements that create the product classes are replaced by calls to the corresponding factory methods defined in the concrete factory object, via the abstract factory class. The algorithm is described in Figure 7. Each refactoring step in the algorithm preserves the behavior of the target program.

Figure 7: The refactoring strategy.

Input
    c : the client class
    c1, c2, ..., cn : product classes in prior version
    c1', c2', ..., cn' : product classes in posterior version

Procedure
    for each ci and ci' do
        if ci = ci' then
            create abstract interface ACi for ci
            define all operations of ci to ACi
        else
            create abstract interface ACi for ci and ci'
            define all common operations between ci and ci' to ACi
        endif
    repeat
    Create abstract factory interface AF
    Define n operations in AF, each of which creates and returns ACi
    Create concrete factory class CF1
    Define n operations in CF1, each of which creates and returns ci
    Create concrete factory class CF2
    Define n operations in CF2, each of which creates and returns ci'
    Add the statement that creates CF2
    Replace each statement "new ci'" with "cf2.createCi"
    Replace the statements that use ci' so that they use ACi

3.4. Example Application

In this section, we illustrate a simple application of the Abstract Factory design pattern. Figure 8 shows the design of a simple program that creates a Honda Prelude with a VTEC2_2 engine and GY184_HR14 tires. It is borrowed from [7]. Assume that there is a posterior version as shown in Figure 9. The posterior version of the program creates a Honda Accord with an Accord engine and Bridge184_SR14 tires.

Figure 8: A design of a simple program. (Classes: App, PreludeCar, GY184_HR14, VTEC2_2, Tires, Engine.)

Figure 9: A posterior version. (Classes: App, AccordCar, Bridge184_SR14, AccordEngine, Tires, Engine.)

With these two versions of the program, the following design information is obtained:
creates(App, AccordCar)
creates(App, Bridge184_SR14)
creates(App, AccordEngine)
refines(PreludeCar, AccordCar)
refines(GY184_HR14, Bridge184_SR14)
refines(VTEC2_2, AccordEngine)
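For orientation, the structure that the refactoring strategy of Section 3.3 aims at for this example might look roughly like the following Java sketch. It reuses the class names of the example (App, PreludeCar, AccordCar, GY184_HR14, Bridge184_SR14, VTEC2_2, AccordEngine) and the names CarFactory and PreludeFactory from the refactored design; the wrapper class CarExample, the AccordFactory class, the name() and build() methods, and the lower-case factory method names are assumptions made for this illustration.

```java
// Illustrative sketch only; member names and bodies are assumed.
final class CarExample {

    // Abstract product interfaces (the ACi of the refactoring strategy).
    interface Car { String name(); }
    interface Tire { String name(); }
    interface Engine { String name(); }

    // Product family of the prior version.
    static final class PreludeCar implements Car { public String name() { return "PreludeCar"; } }
    static final class GY184_HR14 implements Tire { public String name() { return "GY184_HR14"; } }
    static final class VTEC2_2 implements Engine { public String name() { return "VTEC2_2"; } }

    // Product family of the posterior version.
    static final class AccordCar implements Car { public String name() { return "AccordCar"; } }
    static final class Bridge184_SR14 implements Tire { public String name() { return "Bridge184_SR14"; } }
    static final class AccordEngine implements Engine { public String name() { return "AccordEngine"; } }

    // Abstract factory (AF): one creation operation per abstract product.
    interface CarFactory {
        Car makeCar();
        Tire makeTire();
        Engine makeEngine();
    }

    // Concrete factory for the prior family (CF1).
    static final class PreludeFactory implements CarFactory {
        public Car makeCar() { return new PreludeCar(); }
        public Tire makeTire() { return new GY184_HR14(); }
        public Engine makeEngine() { return new VTEC2_2(); }
    }

    // Concrete factory for the posterior family (CF2); the name is hypothetical.
    static final class AccordFactory implements CarFactory {
        public Car makeCar() { return new AccordCar(); }
        public Tire makeTire() { return new Bridge184_SR14(); }
        public Engine makeEngine() { return new AccordEngine(); }
    }

    // Client: every "new" of a product is replaced by a factory call.
    static final class App {
        static String build(CarFactory factory) {
            Car car = factory.makeCar();
            Engine engine = factory.makeEngine();
            Tire tire = factory.makeTire();
            return car.name() + " with " + engine.name() + " and " + tire.name();
        }
    }

    public static void main(String[] args) {
        System.out.println(App.build(new PreludeFactory())); // PreludeCar with VTEC2_2 and GY184_HR14
        System.out.println(App.build(new AccordFactory()));  // AccordCar with AccordEngine and Bridge184_SR14
    }
}
```

In this sketch, switching the client from one product family to the other only requires passing a different factory object to App.build.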
With this design information, the inference rule for the Abstract Factory design pattern can be applied, where the client is App, PC1 is the set consisting of PreludeCar, GY184_HR14, and VTEC2_2, and PC2 is the set consisting of AccordCar, Bridge184_SR14, and AccordEngine. After that, using the refactoring strategy for the Abstract Factory design pattern, the program is transformed as shown in Figure 10. The result might seem more complex than the original program, but when the constituent components of the car that the App class creates need to change, the program structure formed with the Abstract Factory design pattern provides more flexibility and extensibility.

Figure 10: A new version with Abstract Factory design pattern.

4. IMPLEMENTATION ISSUE

Currently, we are developing a prototype tool that supports our approach in an automated way. As a first step, the tool targets Java programs. To extract the design information of a Java program, it is necessary to analyze its source code. We use JavaCC (Java Compiler Compiler) [13] for parsing the source code and constructing its parse trees. JavaCC is a widely used parser generator for Java applications. Our approach is based on the modification history, so the user can choose a set of folders, each of which contains one version of the program, and designate the predecessors and successors in the modification history. Then the tool compares the design models of those versions and constructs the historically related information. For simplicity, our approach assumes that there are only two versions of a program, but N versions will be considered in the officially released tool.

After parsing and reasoning about the design model of a program, the tool converts it to a set of PROLOG facts, and the inference rules to a set of PROLOG rules. Multiple inference rules may be defined for a design pattern, so the tool is designed to let the user select the most suitable rule. After the user chooses an inference rule, the tool converts it into a set of PROLOG rules and starts identifying the candidate spots through a PROLOG query. After that, the tool visualizes which parts could be candidate spots for each design pattern. We use Dot [14] for visualizing the design model. After checking the candidate spots, the user chooses which of them will actually be transformed.

As mentioned, our approach uses the refactoring technique proposed by Cinneide and Nixon [9], which defines minipatterns and corresponding minitransformations. In the prototype tool, the design pattern transformation is performed using a composition of minitransformations, and the way they are composed is determined by the refactoring strategy. Cinneide and Nixon argue that each minitransformation preserves the behavior of the old program; therefore, the design pattern transformation preserves the behavior as well.

5. RELATED WORK

Refactoring [1, 12] has been investigated by many researchers over the last decade as a promising technique for improving the quality of object-oriented programs through behavior-preserving program transformation. William Opdyke [1] proposes a suite of low-level refactorings applicable to C++ programs, with which higher-level refactorings, such as the conversion of an inheritance relationship to an aggregation relationship, can be defined. Recently, Fowler summarized his practices and experience in applying refactoring to several industrial projects in [12].

Several pieces of work have tried to introduce design patterns into legacy programs through refactoring. The common goal of such work is to relieve the maintainer of tedious and error-prone program modification activities. Schulz et al. [6] propose a refactoring technique to introduce design patterns into C++ programs. Tokuda [7] shows that a typical system can evolve significantly faster and more cheaply via refactoring and an automated introduction of design patterns via manual scripts. Recently, Cinneide and Nixon [9] developed a set of minipatterns and corresponding minitransformations that can deal with various design patterns. They also suggest the precursor as the starting point of the program transformations. Mens and Tourwe [11] use a declarative metaprogramming technique to specify, generate, and transform the source code of a program by refactorings aimed at introducing design patterns.
Several people have argued that introducing design patterns for a particular requirements change is undesirable because it might introduce unnecessary complexity into the software system [9]. However, Prechelt and Tichy [22] report, based on experiments, that unless there is a clear reason to prefer the simpler solution, it could be wise to choose the extensibility provided by the design pattern solution, because unexpected new requirements often occur.

6. CONCLUSION

We have proposed an automated approach to refactoring based on design patterns in Java programs, where, for a particular design pattern, a pair (inference rule, refactoring strategy) is defined. First, we have defined the formal model of an object-oriented program as an 8-tuple. Based
on the formal model, we defined a set of Prolog-like predicates that represent the structural and behavioral design properties of a program. In our approach, the identification of candidate spots for a particular design pattern is formulated as an inference rule. The defined refactoring strategy is then applied to a candidate spot confirmed by the maintainer. To explain our approach in detail, we presented the Abstract Factory pattern [3] as an example of an inference rule and a refactoring strategy.

During the maintenance phase, program modifications occur very often. Design patterns allow program solutions to be constructed flexibly and shared as idioms, so applying design patterns when describing an application program lets us manage maintenance cost more efficiently. To introduce design patterns into a legacy program, however, maintainers must understand most of its code manually. For large-scale software, such work surely turns into a werewolf of immense complexity and cost. Our approach alleviates the burden of such tedious, time-consuming, and error-prone tasks, so that maintainers can focus on more creative ones. Although the candidate spots obtained by reasoning about a program may not always be trustworthy, they provide useful information for maintenance activities.

Currently, our approach supports only some of the creational design patterns. To support structural and behavioral design patterns, we are developing inference rules and refactoring strategies for them. Furthermore, to make the inference rules more robust, we continue to refine them by experimenting with various programs. In addition, we are developing a prototype tool to support our approach. Once its development is finished, we plan to evaluate the tool in a practical context.

7. REFERENCES

[1] W. F. Opdyke, Refactoring Object-Oriented Frameworks, PhD thesis, Computer Sciences Department, University of Illinois at Urbana-Champaign, 1992.
[2] S. Demeyer, S. Ducasse, and O. Nierstrasz, “Finding Refactorings via Change Metrics,” In Proceedings of the International Conference on Object-Oriented Programming, Systems, Languages, and Applications, pp. 166-177, ACM Press, 2000.
[3] E. Gamma, R. Helm, R. Johnson, and J. Vlissides, Design Patterns: Elements of Reusable Object-Oriented Software, Addison-Wesley, 1995.
[4] G. Florijn, M. Meijers, and P. van Winsen, “Tool Support in Design Patterns,” In Proceedings of European Conference on Object Oriented Programming, pp. 472-495, June 1997.
[5] M. Meijers, Tool Support for Object-Oriented Design Patterns, Master's thesis, Department of Computer Science, Rijksuniversiteit Utrecht, August 1996.
[6] B. Schulz, T. Genssler, B. Mohr, and W. Zimmer, “On the Computer Aided Introduction of Design Patterns into Object-Oriented Systems,” In Proceedings of Technology of Object-Oriented Languages and Systems, IEEE Computer Society, 1998.
[7] L. Tokuda and D. Batory, “Automated Software Evolution via Design Pattern Transformations,” In Proceedings of the 3rd International Symposium on Applied Corporate Computing, Monterrey, Mexico, October 1995.
[8] L. Tokuda, Evolving Object-Oriented Design with Refactorings, PhD thesis, University of Texas at Austin, 1999.
[9] M. O Cinneide and P. Nixon, “A Methodology for the Automated Introduction of Design Patterns,” In Proceedings of the IEEE International Conference on Software Maintenance, pp. 463-472, IEEE Computer Society, 1999.
[10] M. O Cinneide, Automated Application of Design Patterns: A Refactoring Approach, PhD thesis, University of Dublin, Trinity College, 2000.
[11] T. Mens and T. Tourwe, “A Declarative Evolution Framework for Object-Oriented Design Patterns,” In Proceedings of the IEEE International Conference on Software Maintenance, IEEE Computer Society, 2001.
[12] M. Fowler, Refactoring: Improving the Design of Existing Code, Addison-Wesley, 1999.
[13] JavaCC™, The Java Parser Generator, Available from: http://suntest.com/JavaCC.
[14] E. Koutsofios and S. C. North, “Drawing Graphs With Dot,” AT&T Bell Laboratories, Murray Hill, NJ, Online at http://www.research.att.com/sw/tools/graphviz/
[15] K. Maruyama and K. Shima, “Automatic Method Refactoring Using Weighted Dependence Graphs,” In Proceedings of the 21st International Conference on Software Engineering, Los Angeles, pp. 236-245, May 1999.
[16] F. Simon, F. Steinbruckner, and C. Lewerentz, “Metrics Based Refactoring,” In Proceedings of 5th European Conference on Software Maintenance and Reengineering, pp. 30-38, IEEE Computer Society, Los Alamitos, 2001.
[17] R. K. Keller, R. Schauer, S. Robitaille, and P. Pagé, “Pattern-Based Reverse-Engineering of Design Components,” In Proceedings of the 21st International Conference on Software Engineering, IEEE Computer Society, Los Angeles, pp. 226-235, May 1999.
[18] F. Shull, W. L. Melo, and V. R. Basili, An Inductive Method for Discovering Design Patterns from Object-Oriented Software Systems, Technical report, University of Maryland, Computer Science Department, College Park, MD, 20742 USA, October 1996.
[19] C. Kramer and L. Prechelt, “Design Recovery by Automated Search for Structural Design Patterns in Object-Oriented Software,” In Proceedings of the Working Conference on Reverse Engineering, pp. 208-215, 1996.
[20] J. Seemann and J. W. von Gudenberg, “Pattern-Based Design Recovery of Java Software,” In Proceedings of Foundation of Software Engineering, pp. 10-16, November 1998.
[21] D. Jackson and A. Waingold, “Lightweight Extraction of Object Models from Bytecode,” In Proceedings of the 21st International Conference on Software Engineering, pp. 194-202, Los Angeles, May 1999.
[22] L. Prechelt and W. F. Tichy, “A Controlled Experiment in Maintenance Comparing Design Patterns to Simpler Solutions,” IEEE Transactions on Software Engineering, vol. 27, no. 12, pp. 1134-1144, 2001.