Elemental Design Patterns Recognition in Java
Francesca Arcelli, Stefano Masiero, Claudia Raibulet
Università degli Studi di Milano-Bicocca, DISCo – Dipartimento di Informatica, Sistemistica e Comunicazione
{arcelli, masiero, raibulet}@disco.unimib.it
Abstract
The decomposition of design patterns into simpler elements may significantly reduce the creation of variants in forward engineering, while increasing the possibility of identifying applied patterns in reverse engineering. Nevertheless, few reverse engineering tools exploit the decomposition of patterns (namely FUJABA and SPQR). The SPQR approach introduces a catalog of elemental design patterns (EDPs) and a rule set based on sigma-calculus through which EDPs are defined and composed into design patterns. Considering the SPQR approach particularly interesting, we propose a novel solution for defining and detecting EDPs and, further, design patterns. Our approach defines EDPs as logical functions of eight symbolic variables, each variable representing a property of a method call (e.g., method name, method signature, method declaration, this reference, super reference) or a class property (superclass, same family, same object). An EDP detector has been developed based on this approach; it represents a starting point for future developments towards design pattern recognition in the reverse engineering context.
1. Introduction
The idea of decomposing design patterns into recurring elements has emerged in the context of both forward and reverse engineering. Such elements are called fragments [6], motifs [5], minipatterns [4], micro-patterns [29], sub-patterns [15, 16], or elemental design patterns [18, 21]. As the variety of names suggests, there is little agreement on what design patterns should be decomposed into. Among the main benefits of decomposing design patterns into subcomponents are the reduction of variant generation in forward engineering (variants of design patterns are translated into variants of their simpler subcomponents, which are significantly less numerous and less complex), and the increase of the rate of identifying applied patterns in reverse engineering.
Currently, several reverse engineering tools [13, 19, 30, 31, 32, 33] address design pattern detection. Although decomposing design patterns into sub-components may significantly improve their detection process and results, there are few tools (to the best of our knowledge) that exploit this approach: FUJABA (From UML to Java And Back Again) [15] and SPQR (System for Pattern Query and Recognition) [19]. The two tools decompose design patterns for different reasons, but both obtain significant results. FUJABA is a forward and reverse engineering tool exploiting sub-patterns [16] to reduce the dimension of the design pattern catalog and the complexity of the elements searched in the source code, as well as to improve the detection algorithm. SPQR is an automatic tool for design pattern detection in C++. Elemental Design Patterns (EDPs), the sub-components of design patterns, play a central role in the context of SPQR. Extraction of information from source code is performed according to the elements EDPs are built of. Design pattern detection is reduced to EDP detection, while design patterns are expressed exclusively through EDPs. A detailed comparison of these two tools, together with the advantages they provide, is given in [2].
We consider the SPQR approach particularly interesting for the following reason: it claims to be the first automatic toolset able to identify design patterns precisely through a “highly formalized semantics” [19, 24] exploiting the sigma calculus [1]. The available documentation of SPQR defines a complete catalog of EDPs and shows how EDPs can be described through rho-calculus [22], a subset of sigma calculus enriched with reliance operators. However, SPQR is not available for testing. The information provided by its authors in their research papers and technical reports is not enough to rebuild the SPQR approach for the definition and detection of EDPs and design patterns. Moreover, in our opinion, SPQR is neither easily comprehensible nor easily extensible when an EDP or a design pattern has to be added or modified.
The work presented in this paper starts from the EDP catalog of SPQR and provides an alternative approach to detect EDPs in Java code, carefully exploiting the particularities of the Java language. Based on our approach, a prototype tool for EDP recognition in Java, called EDPDetector4Java, has been developed [12]. We tested EDPDetector4Java on available systems of significant dimension and complexity, such as JUnit [26] and JEdit [27]. The aim of the paper is to describe in detail the new approach for EDP detection based on logical functions of eight symbolic variables, each variable representing a method call or a class property. Furthermore, we are studying the extension of EDPDetector4Java towards DPDetector4Java.
The rest of the paper is organized as follows. Section 2 provides an analysis of EDPs and of the Java method and class properties we have identified for their description. Section 3 introduces the definition of the EDPs in Java, as well as a brief description of the tool prototype we have implemented. Conclusions and current work are dealt with in Section 4.
2. Analyzing EDPs
The starting point of our work is the GoF-like description [7] of the EDPs [18]. EDPs are divided into three groups:
Object Element, which contains three elemental patterns dealing with the creation and definition of objects (Create Object, Abstract Interface, and Retrieve);
Type Relation, which contains one elemental pattern describing the inheritance relationship;
Method Invocation, which contains twelve elemental patterns describing the common method calls identified in the GoF catalog.
The EDPs of the first two groups are trivial to detect because they correspond directly to constructs of the Java language:
Create Object has as intent “to ensure that newly allocated data structures conform to a set of assertions and preconditions before they are operated on by the rest of the system, and that can only be operated on in pre-defined ways” [18]. This EDP is detected in Java through the new keyword.
Abstract Interface aims at providing “a common interface for operating on an object type family, but delaying definition of the actual operations to a later time” [18]. In Java, this EDP is detected through the abstract or interface keywords.
Retrieve indicates “the usage of an object from another non-local source in the local scope, thereby creating a relationship and link between the local object and the remote one” [18]. In Java, this EDP is detected when a class has an attribute whose type is another class, and when it defines a method returning a value whose type is a third class.
Inheritance indicates “the usage of all of another classes’ interface, and all or some of its implementation” [18]. In Java, this EDP is detected through the extends or implements keywords.
More interesting issues are raised by the EDPs of the third group, Method Invocation. Each EDP of this group is detected through two main types of information: (1) the relationship between the method (referrer) which calls another method (referred), and (2) the relationship between the object which contains the referrer (sender) and the object which contains the referred (receiver). The analysis of these two relationships is described in detail by the authors of SPQR in [20]. Their goal has been to generalize the analysis for object-oriented languages. Our aim is to particularize the analysis for the Java language. Hence, we propose a slight modification of the names of the terms occurring in the Method Calling Classification of the EDP catalog [18] to avoid ambiguities (see Table 1). We preserve their original semantics.
Relationship           Original Name         Modified Name
Sender – Receiver      Other                 Other Class
Sender – Receiver      Same                  Same Class
Referrer – Referred    Same (declaration)    Same Method
Referrer – Referred    Same (signature)      Same Signature
Table 1. Modified Names
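Before turning to method invocations, the keyword-level detection described above for the first two groups can be illustrated with a small sketch. All class names below are hypothetical and chosen only for illustration; the comments name the EDP that each construct would signal under the detection criteria stated in the text.

```java
// Hypothetical example classes; comments name the EDP each construct would signal.

interface Greeter {                              // Abstract Interface: `interface` keyword
    String greet(String name);
}

abstract class BaseGreeter implements Greeter {  // Inheritance: `implements`; Abstract Interface: `abstract`
    abstract String prefix();

    @Override
    public String greet(String name) {
        return prefix() + ", " + name;
    }
}

class PoliteGreeter extends BaseGreeter {        // Inheritance: `extends`
    @Override
    String prefix() {
        return "Good day";
    }
}

class GreeterRegistry {
    private final Greeter current = new PoliteGreeter();  // Create Object: `new`

    Greeter current() {
        return current;
    }
}

class GreeterClient {
    // Retrieve: an attribute whose type is another class (GreeterRegistry) ...
    private final GreeterRegistry registry = new GreeterRegistry();

    // ... together with a method returning a value of a third class (Greeter).
    Greeter lookup() {
        return registry.current();
    }
}
```

A detector working at this level only needs declarations and keywords; no analysis of method bodies is required, which is why these four EDPs are called trivial to detect.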
Each method invocation which characterizes the behavior of the objects defining an EDP is determined by both the relation between the objects and the relation between the methods involved in the interaction. Table 2 introduces the relationships specific to each EDP belonging to the Method Invocation group, as described in the SPQR approach and updated according to Table 1.
2.1. Identification of Method Invocation Relationships in Java
Starting from the definition of the EDPs through the two types of relationships presented in Table 2, we have identified the following eight properties of methods and objects that we can exploit in the Java language to define and detect all twelve EDPs of the Method Invocation group. These properties are:
Same Method Name: the two methods involved in the interaction (referrer and referred) have the same name, not necessarily the same formal parameters; hence they may have different signatures;
Same Method Signature: the two methods involved in the interaction (referrer and referred) have the same signature; in addition, they are not abstract; in this case the sender and the receiver are different classes, which may or may not be connected by an inheritance (direct or indirect) relationship; they may or may not have common supertypes;
Same Method Declaration: the two methods involved in the interaction (referrer and referred) are actually one method; hence, the sender also plays the role of receiver; this is usual in the presence of recursion;
Same CT (Compile-Time) Class: this property is related to the type of the receiver, the type of the sender being always known at run-time; the compile-time type of the receiver may not be the same as its run-time type;
Superclass: the receiver is a superclass of the sender, not necessarily a direct superclass, but any superclass in the inheritance hierarchy;
Same Family: the sender and the receiver are distinct classes and, in addition, they have at least one common superclass; this does not exclude that the receiver may be a subclass of the sender or vice versa;
Super Reference: the sender explicitly uses the keyword super to invoke the referred of the receiver;
This Reference: the referrer explicitly uses the keyword this to invoke the referred, or the referred is invoked without any reference.
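To make the eight properties more concrete, the following sketch annotates a few call sites with the properties that appear to hold for them, following the definitions listed above. The classes `Figure` and `Disc` are hypothetical, and the annotations are our reading of the definitions rather than output of the tool. Same Family is deliberately left out of the comments, since whether it holds depends on how common superclasses are counted.

```java
// Hypothetical classes; each call site is annotated with the properties that appear to hold.
class Figure {
    final StringBuilder log = new StringBuilder();

    void draw() {
        log.append("Figure.draw;");
    }

    void render() {
        // Call without any reference: This Reference; the receiver's compile-time
        // class is the sender's class (Same CT Class); the method names differ.
        draw();
    }
}

class Disc extends Figure {
    @Override
    void draw() {
        // Super Reference and Superclass; Same Method Name and Same Method Signature,
        // but not Same Method Declaration (the referred is Figure.draw).
        super.draw();
        log.append("Disc.draw;");
    }

    int countDown(int n) {
        // Explicit this: This Reference; referrer and referred are one method, so
        // Same Method Name, Signature, Declaration and Same CT Class all hold.
        return n == 0 ? 0 : 1 + this.countDown(n - 1);
    }

    int relay(Disc other, int n) {
        // Call through another reference of the sender's own class: Same CT Class and
        // Same Method Declaration, with neither a this nor a super reference.
        return n == 0 ? 0 : 1 + other.relay(this, n - 1);
    }
}
```

Note that the annotations describe what static analysis can see: `render()` statically refers to `Figure.draw`, even though at run time the call may dispatch to `Disc.draw`.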
Elemental Design Pattern       Sender–Receiver Relationship    Referrer–Referred Relationship
Delegate                       Other Class                     Dissimilar
Redirect                       Other Class                     Similar
Conglomeration                 Self                            Different
Recursion                      Self                            Same Declaration
Revert Method                  Super                           Dissimilar
Extend Method                  Super                           Same Signature
Delegated Conglomeration       Same Class                      Dissimilar
Redirected Recursion           Same Class                      Same Declaration
Delegate in Family             Parent                          Dissimilar
Redirect in Family             Parent                          Similar
Delegate in Limited Family     Sibling                         Dissimilar
Redirect in Limited Family     Sibling                         Similar
Table 2. EDPs Description through Their Definition Relationships
We have adopted an approach similar to the one used for combinational logic circuits, considering as inputs the eight properties of methods and objects previously mentioned and as possible output values true and false, expressing whether a property is verified or not. Notations, axioms, and theorems have been borrowed from Shannon’s switching algebra [25]. Each of the eight properties has been associated with a symbolic variable $A_i$, where the index $i$ identifies the i-th property (see Table 3). In our approach, each rule for identifying an EDP is represented as a logical function having the symbolic variables as terms. The total number of possible combinations is $2^8 = 256$, but most of them can be excluded. Some of the properties under investigation are not independent of each other; hence they can be met only in the presence of other properties. The dependences have been identified considering the Java language specification [8] and the Java Virtual Machine specification from Sun Microsystems [10], as well as the types of information involved in the static analysis.
Class/Method Property        Logical Variable
Same Method Name             $A_0$
Same Method Signature        $A_1$
Same Method Declaration      $A_2$
Same CT Class                $A_3$
Superclass                   $A_4$
Same Family                  $A_5$
Super Reference              $A_6$
This Reference               $A_7$
Table 3. Property – Symbolic Variable Correspondence
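The correspondence of Table 3 lends itself to a bit-set encoding, sketched below. The sketch assumes that variable $A_i$ carries the weight $2^i$ in a combination index; this ordering is not stated explicitly in the text and is inferred from the sums given in Section 3 (for example, Redirect is listed as the combinations 1 and 3). The names `CallProperty` and `Combination` are hypothetical.

```java
import java.util.EnumSet;
import java.util.Set;

/** The eight properties of Table 3; property Ai is assumed to map to bit i (weight 2^i). */
enum CallProperty {
    SAME_METHOD_NAME,        // A0
    SAME_METHOD_SIGNATURE,   // A1
    SAME_METHOD_DECLARATION, // A2
    SAME_CT_CLASS,           // A3
    SUPERCLASS,              // A4
    SAME_FAMILY,             // A5
    SUPER_REFERENCE,         // A6
    THIS_REFERENCE;          // A7

    int mask() {
        return 1 << ordinal();
    }
}

final class Combination {
    /** Packs a set of verified properties into an index between 0 and 255. */
    static int encode(Set<CallProperty> verified) {
        int index = 0;
        for (CallProperty p : verified) {
            index |= p.mask();
        }
        return index;
    }

    /** Recovers the set of verified properties from an index. */
    static Set<CallProperty> decode(int index) {
        Set<CallProperty> verified = EnumSet.noneOf(CallProperty.class);
        for (CallProperty p : CallProperty.values()) {
            if ((index & p.mask()) != 0) {
                verified.add(p);
            }
        }
        return verified;
    }
}
```

Under this assumption, a call with the same name and signature made through super to a superclass (A0, A1, A4, A6) encodes to 1 + 2 + 16 + 64 = 83.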
2.1.1. Combinations to Exclude
In addition to the notational conventions of the switching algebra, we have introduced “⊥” to indicate that a combination can never be verified, and RL and RI to indicate restrictions generated by the Java language and restrictions generated by the type of information extracted from the static analysis of source code, respectively.
Same Method Name and Same Method Signature (A0, A1, RL)
The two properties are strictly related. The Java language specification defines the signature of a method through the method name, the number of formal parameters, and the types of the formal parameters. Therefore, two methods with different names cannot have the same signature:
$$\overline{A_0} \cdot A_1 = \bot$$
Same Method Name and Same Method Declaration (A0, A2, RL)
A method declaration corresponds to a single method name. Therefore, we exclude the case in which a method declaration is associated with two different method names:
$$\overline{A_0} \cdot A_2 = \bot$$
All other cases are possible. There may exist two different declarations with the same name, whether they are in the same class or not. The compiler establishes which method to call based on its signature. This is typical of overloading.
Same Method Signature and Same Method Declaration (A1, A2, RL)
A class cannot define two methods with the same signature; the compiler would report an error. Therefore, a method declaration cannot have two different signatures:
$$\overline{A_1} \cdot A_2 = \bot$$
Two different methods, with different declarations, may have the same signature if they belong to two different classes. For example, sub-classes may redefine or re-implement the methods of their superclasses. This technique is known as overriding. Therefore, another case to be excluded specifies that two different methods with the same signature cannot be defined in the same class:
$$A_1 \cdot \overline{A_2} \cdot A_3 = \bot$$
Moreover, the same case in the presence of a this reference is also excluded:
$$A_1 \cdot \overline{A_2} \cdot A_7 = \bot$$
Same Method Declaration and Same CT Class (A2, A3, RI)
When the referrer and the referred refer to the same method declaration, they automatically identify the same class that contains the method declaration. The sender and the receiver refer to the same class. Therefore, we exclude the case in which a method declaration may belong to different classes:
$$A_2 \cdot \overline{A_3} = \bot$$
Same Method Declaration and Superclass (A2, A4, RI)
If the referrer and the referred refer to the same method declaration contained in the same class, the receiver cannot be a superclass of the sender:
$$A_2 \cdot A_4 = \bot$$
Semantically, this case is similar to the previous one: two classes involved in an inheritance relationship must be different.
Same Method Declaration and Super Reference (A2, A6, RL, RI)
According to the Java language specification, the super keyword specifies that a method defined by the direct superclass of the sender is called. Therefore, we exclude that the referrer and the referred can refer to the same method declaration in the presence of super:
$$A_2 \cdot A_6 = \bot$$
Same CT Class and Superclass (A3, A4, RI)
The relationship between the sender and the receiver cannot be characterized by both properties at the same time: Superclass excludes the possibility of having Same CT Class and vice versa:
$$A_3 \cdot A_4 = \bot$$
Same CT Class and Super Reference (A3, A6, RL, RI)
The presence of the keyword super indicates that the referred method belongs to the superclass; therefore we exclude the possibility of having Same CT Class and Super Reference simultaneously:
$$A_3 \cdot A_6 = \bot$$
Same CT Class and This Reference (A3, A7, RL, RI)
According to the Java language specification, this indicates a reference to the object itself. Therefore, we exclude the case in which the sender and the receiver are two different classes in the presence of This Reference:
$$\overline{A_3} \cdot A_7 = \bot$$
Superclass and Super Reference (A4, A6, RL, RI)
The presence of the super keyword indicates that the receiver is the direct superclass of the sender. Therefore, we exclude the case in which the receiver is not a superclass of the sender in the presence of super:
$$\overline{A_4} \cdot A_6 = \bot$$
Superclass and This Reference (A4, A7, RL, RI)
The presence of this indicates that the sender and the receiver are the same object. Therefore, we exclude the possibility of having Superclass and This Reference simultaneously:
$$A_4 \cdot A_7 = \bot$$
Same Family and Super Reference (A5, A6, RL, RI)
All the combinations between these two properties are allowed. However, the explicit use of super leads to the conclusion that the receiver is a direct superclass of the sender. There are two possible situations: the receiver has no superclasses, therefore the Same Family property is not verified; or the receiver has at least one superclass, which becomes the common superclass of both the sender and the receiver, in which case the Same Family property is verified. Due to this ambiguity, the presence of a Super Reference excludes the possibility to have the Same Family property.
Same Family and This Reference (A5, A7, RL, RI)
The presence of this implies that the sender and the receiver are of the same class. The definition of the Same Family property excludes the possibility of having the sender and the receiver of the same class (otherwise the definition of Same CT Class would be redundant). Thus, we exclude the following combination:
$$A_5 \cdot A_7 = \bot$$
Super Reference and This Reference (A6, A7, RI)
We exclude the possibility of having both a super and a this reference simultaneously for the same method call:
$$A_6 \cdot A_7 = \bot$$
Excluding the restrictions previously described, we obtain all the possible combinations through which EDPs are represented (see Table 4). We have reduced the possible combinations from 256 to 27. Each EDP is described by one or more combinations present in Table 4.
2.2. Definition of Method Invocation Relationships in Java
Before introducing the EDP definition functions, we describe how each of the relationships present in a method invocation (see Table 2) is expressed through the eight method call properties.
Other Class indicates that the sender and the receiver have distinct types, hence belonging to different hierarchies of types. Therefore, we exclude the possibility of having Other Class when we have at least one of the following situations: Same CT Class, Superclass, Same Family, Super Reference, or This Reference:
$$A_3 + A_4 + A_5 + A_6 + A_7$$
In all other cases, Other Class is verified:
$$\overline{A_3 + A_4 + A_5 + A_6 + A_7} = \overline{A_3} \cdot \overline{A_4} \cdot \overline{A_5} \cdot \overline{A_6} \cdot \overline{A_7}$$
Considering the two restrictions $\overline{A_4} \cdot A_6 = \bot$ and $\overline{A_3} \cdot A_7 = \bot$ we obtain:
$$\text{Other Class} = \overline{A_3} \cdot \overline{A_4} \cdot \overline{A_5}$$
Self indicates that the sender and the receiver represent the same instance. Considering polymorphism, we exclude the possibility of having Self when we have at least one of the following situations: not Same CT Class and not Superclass, or Super Reference:
$$(\overline{A_3} \cdot \overline{A_4}) + A_6$$
The Self relationship may be verified when:
$$\overline{(\overline{A_3} \cdot \overline{A_4}) + A_6} = (A_3 + A_4) \cdot \overline{A_6}$$
There is another case in which the Self relationship is verified, namely in the presence of the this keyword:
$$(A_3 + A_4) \cdot \overline{A_6} + A_7 = A_3 \cdot \overline{A_6} + A_4 \cdot \overline{A_6} + A_7$$
Applying the restriction $A_3 \cdot A_6 = \bot$ we obtain:
$$\text{Self} = A_3 + A_4 \cdot \overline{A_6} + A_7$$
Super indicates that there is an explicit use of the super keyword; therefore the relationship Super is verified when:
$$\text{Super} = A_6$$
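The reduction from 256 to 27 combinations can be checked mechanically. The sketch below is not part of the original tool: it enumerates all 256 combinations and discards those hit by a restriction of Section 2.1.1. Each restriction is read from its prose description (for example, "two methods with different names cannot have the same signature" becomes "not A0 and A1"), and $A_i$ is assumed to carry the weight $2^i$, as suggested by the sums listed in Section 3. The class name `AdmissibleCombinations` is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

final class AdmissibleCombinations {
    /** True when property Ai is verified in combination m (Ai taken as bit i; an assumption). */
    static boolean a(int m, int i) {
        return ((m >> i) & 1) == 1;
    }

    /** Applies the restrictions of Section 2.1.1, read from their prose descriptions. */
    static boolean isAdmissible(int m) {
        boolean a0 = a(m, 0), a1 = a(m, 1), a2 = a(m, 2), a3 = a(m, 3);
        boolean a4 = a(m, 4), a5 = a(m, 5), a6 = a(m, 6), a7 = a(m, 7);
        boolean excluded =
                (!a0 && a1)             // different names, same signature
             || (!a0 && a2)             // different names, same declaration
             || (!a1 && a2)             // different signatures, same declaration
             || (a1 && !a2 && a3)       // same signature, two declarations, same class
             || (a1 && !a2 && a7)       // same case through a this reference
             || (a2 && !a3)             // same declaration, different classes
             || (a2 && a4)              // same declaration, receiver is a superclass
             || (a2 && a6)              // same declaration reached through super
             || (a3 && a4)              // same class and superclass at once
             || (a3 && a6)              // same class reached through super
             || (!a3 && a7)             // this reference to a different class
             || (!a4 && a6)             // super without a superclass receiver
             || (a4 && a7)              // this reference to a superclass
             || (a5 && a7)              // same family through a this reference
             || (a6 && a7);             // super and this on the same call
        return !excluded;
    }

    static List<Integer> admissible() {
        List<Integer> result = new ArrayList<>();
        for (int m = 0; m < 256; m++) {
            if (isAdmissible(m)) {
                result.add(m);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Integer> kept = admissible();
        System.out.println(kept.size() + " combinations: " + kept);
    }
}
```

With these fifteen exclusions the enumeration keeps 27 indices, which agrees with the count stated in the text; the pair Same Family and Super Reference is left unrestricted, since the text states that all their combinations are allowed.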
Table 4. Possible Combinations
Same Class indicates that the sender and the receiver are two different instances of the same class. Considering the ambiguities generated by polymorphism, we exclude the cases in which no Same Class relationship is verified: not Same CT Class and not Superclass, Super Reference, or This Reference:
$$\overline{A_3} \cdot \overline{A_4} + A_6 + A_7$$
In all other cases the relationship is verified:
$$\overline{\overline{A_3} \cdot \overline{A_4} + A_6 + A_7} = (A_3 + A_4) \cdot \overline{A_6} \cdot \overline{A_7}$$
The Same Class relationship is given by the following expression:
$$\text{Same Class} = A_3 \cdot \overline{A_6} \cdot \overline{A_7} + A_4 \cdot \overline{A_6} \cdot \overline{A_7}$$
Parent indicates that the receiver is a superclass of the sender. Considering the ambiguities generated by polymorphism, we exclude the cases in which no Parent relationship is verified: Same CT Class, not Superclass, Super Reference, or This Reference:
$$A_3 + \overline{A_4} + A_6 + A_7$$
In all other cases, the relationship Parent is verified:
$$\overline{A_3 + \overline{A_4} + A_6 + A_7} = \overline{A_3} \cdot A_4 \cdot \overline{A_6} \cdot \overline{A_7}$$
Applying the restriction $A_3 \cdot A_4 = \bot$ (together with $A_4 \cdot A_7 = \bot$, which makes the term $\overline{A_7}$ redundant) we obtain:
$$\text{Parent} = A_4 \cdot \overline{A_6}$$
Sibling indicates that the sender and the receiver have a common superclass, although they are not in a Parent relationship. Therefore, the following situations should be verified simultaneously to have a Sibling relationship: not Same CT Class, not Superclass, Same Family, not Super Reference, and not This Reference:
$$\overline{A_3} \cdot \overline{A_4} \cdot A_5 \cdot \overline{A_6} \cdot \overline{A_7}$$
Applying the restrictions $\overline{A_3} \cdot A_7 = \bot$ and $\overline{A_4} \cdot A_6 = \bot$ we obtain:
$$\text{Sibling} = \overline{A_3} \cdot \overline{A_4} \cdot A_5$$
Dissimilar indicates that the referrer and the referred have different names. We exclude the possibility of having a Dissimilar relationship when we have at least one of the following situations: Same Method Name, Same Method Signature, or Same Method Declaration:
$$A_0 + A_1 + A_2$$
Dissimilar is therefore verified for $\overline{A_0 + A_1 + A_2} = \overline{A_0} \cdot \overline{A_1} \cdot \overline{A_2}$. Applying the restrictions $\overline{A_0} \cdot A_1 = \bot$ and $\overline{A_0} \cdot A_2 = \bot$ we obtain:
$$\text{Dissimilar} = \overline{A_0}$$
Similar indicates that the referrer and the referred have the same name, but they do not represent the same method. Therefore, we assume that the Same Method Name property should be verified, while the Same Method Declaration property should not be verified, to have a Similar relationship:
$$\text{Similar} = A_0 \cdot \overline{A_2}$$
Different indicates that the referrer and the referred are two different methods. Considering polymorphism, we exclude the possibility of having different methods when we have at least one of the following situations: Same Method Declaration, Same Method Signature and Same CT Class, or Same Method Signature and This Reference:
$$A_2 + A_1 \cdot A_3 + A_1 \cdot A_7$$
The relationship Different is verified when:
$$\overline{A_2 + A_1 \cdot A_3 + A_1 \cdot A_7} = \overline{A_2} \cdot (\overline{A_1} + \overline{A_3}) \cdot (\overline{A_1} + \overline{A_7})$$
Applying the restrictions $\overline{A_1} \cdot A_2 = \bot$, $A_2 \cdot \overline{A_3} = \bot$ and $\overline{A_3} \cdot A_7 = \bot$ we obtain:
$$\text{Different} = \overline{A_1} + \overline{A_3}$$
There are four additional cases in which we can assert that the Different relationship is verified: not Same Signature, not Same Method Declaration and not Superclass, not Same CT Class and Superclass, or Super Reference. Adding these possible cases we obtain:
$$\overline{A_1} + \overline{A_3} + \overline{A_1} + \overline{A_2} \cdot \overline{A_4} + \overline{A_3} \cdot A_4 + A_6$$
Finally, the expression for Different is represented as follows:
$$\text{Different} = \overline{A_1} + \overline{A_2} \cdot \overline{A_4} + \overline{A_3} + A_6$$
Same Method indicates that the referrer and the referred are actually the same method. From the definition point of view, this property is complementary to Different. Its complete expression is given by the union of the following:
we exclude having Same Method when Different is verified:
$$\overline{\overline{A_1} + \overline{A_2} \cdot \overline{A_4} + \overline{A_3} \cdot A_4 + A_6} = A_2 + A_1 \cdot A_4 \cdot \overline{A_6}$$
we consider having Same Method when Different is not verified:
$$A_2 + A_1 \cdot A_3 + A_1 \cdot A_7$$
The final form of the expression is:
$$\text{Same Method} = A_2 + A_1 \cdot A_3 + A_1 \cdot A_7 + A_1 \cdot A_4 \cdot \overline{A_6}$$
Same Signature indicates that the referrer and the referred have the same signature. This is expressed through the Same Method Signature property:
$$\text{Same Signature} = A_1$$
Table 5 summarizes the expressions related to all the class and method properties necessary to recognize EDPs in Java:
Class/Object Property    Recognition Rule
Other Class              $\overline{A_3} \cdot \overline{A_4} \cdot \overline{A_5}$
Same Class               $A_3 \cdot \overline{A_6} \cdot \overline{A_7} + A_4 \cdot \overline{A_6} \cdot \overline{A_7}$
Self                     $A_3 + A_4 \cdot \overline{A_6} + A_7$
Super                    $A_6$
Parent                   $A_4 \cdot \overline{A_6}$
Sibling                  $\overline{A_3} \cdot \overline{A_4} \cdot A_5$
Method Property          Recognition Rule
Dissimilar               $\overline{A_0}$
Similar                  $A_0 \cdot \overline{A_2}$
Different                $\overline{A_1} + \overline{A_2} \cdot \overline{A_4} + \overline{A_3} + A_6$
Same Method              $A_2 + A_1 \cdot A_3 + A_1 \cdot A_7 + A_1 \cdot A_4 \cdot \overline{A_6}$
Same Signature           $A_1$
Table 5. Recognition Rules for Classes/Objects and Methods Properties
3. Definition of EDPs in Java
The next step consists in combining the class/object and method properties (presented in Table 5) to obtain the functions describing EDPs. Then, we associate the EDP functions with the possible combinations of the eight properties considered when detecting EDPs in Java systems (see Table 4). Therefore, EDP functions are expressed as canonical sums of products of the eight Java-specific properties:
$$F = \sum_{A_0,\ldots,A_7}(m_k, \ldots, m_{k+i-1})$$
where $k, i \in \{0, \ldots, 2^n\}$. The indices $m_0, m_1, \ldots$ correspond to the first column of Table 4.
Delegate parcels out “a portion of the current work to another method in another object” [18]. It is characterized by the Other Class and Dissimilar properties:
$$\overline{A_3} \cdot \overline{A_4} \cdot \overline{A_5} \cdot \overline{A_0}$$
The possible combinations available in Table 4 and corresponding to this function are:
$$\text{Delegate} = \sum_{A_0,\ldots,A_7}(0)$$
Redirect requests that “another object performs a tightly related subtask to the task at hand, perhaps performing the basic work” [18]. It is characterized by the Other Class and Similar properties:
$$\overline{A_3} \cdot \overline{A_4} \cdot \overline{A_5} \cdot A_0 \cdot \overline{A_2}$$
The possible combinations available in Table 4 and corresponding to this function are:
$$\text{Redirect} = \sum_{A_0,\ldots,A_7}(1, 3)$$
Conglomeration brings together “diverse operations and behaviors to complete a more complex task within a single object” [18]. It is characterized by the Self and Different properties:
$$(A_3 + A_4 \cdot \overline{A_6} + A_7) \cdot (\overline{A_1} + \overline{A_2} \cdot \overline{A_4} + \overline{A_3} + A_6)$$
The possible combinations available in Table 4 and corresponding to this function are:
$$\text{Conglomeration} = \sum_{A_0,\ldots,A_7}(8, 9, 16, 17, 19, 40, 41, 48, 49, 51, 136, 137)$$
Recursion aims “to accomplish a larger task by performing many smaller similar tasks, using the same object state” [18]. It is characterized by the Self and Same Method properties:
$$(A_3 + A_4 \cdot \overline{A_6} + A_7) \cdot (A_2 + A_1 \cdot A_3 + A_1 \cdot A_7 + A_1 \cdot A_4 \cdot \overline{A_6})$$
The possible combinations available in Table 4 and corresponding to this function are:
$$\text{Recursion} = \sum_{A_0,\ldots,A_7}(15, 19, 47, 51, 143)$$
Revert Method aims at “bypassing the current class implementation of a method, and instead uses the superclass implementation, reverting to an earlier method body” [18]. It is characterized by the Super and Dissimilar properties:
$$A_6 \cdot \overline{A_0}$$
The possible combinations available in Table 4 and corresponding to this function are:
$$\text{Revert Method} = \sum_{A_0,\ldots,A_7}(80, 112)$$
Extend Method “adds to (not replaces) behavior in a method of a superclass while reusing existing code” [18]. It is characterized by the Super and Same Signature properties:
$$A_6 \cdot A_1$$
The possible combinations available in Table 4 and corresponding to this function are:
$$\text{Extend Method} = \sum_{A_0,\ldots,A_7}(83, 115)$$
Delegated Conglomeration claims for “a distinct instance of an object type in the Conglomeration pattern, hence requiring a Delegate” [18]. It is characterized by the Same Class and Dissimilar properties:
$$(A_3 \cdot \overline{A_6} \cdot \overline{A_7} + A_4 \cdot \overline{A_6} \cdot \overline{A_7}) \cdot \overline{A_0} = \overline{A_0} \cdot \overline{A_6} \cdot \overline{A_7} \cdot (A_3 + A_4)$$
The possible combinations available in Table 4 and corresponding to this function are:
$$\text{Delegated Conglomeration} = \sum_{A_0,\ldots,A_7}(8, 16, 40, 48)$$
Redirected Recursion performs “a recursive method, requiring interaction with multiple objects of the same type” [18]. It is characterized by the Same Class and Same Method properties:
$$(A_3 \cdot \overline{A_6} \cdot \overline{A_7} + A_4 \cdot \overline{A_6} \cdot \overline{A_7}) \cdot (A_2 + A_1 \cdot A_3 + A_1 \cdot A_7 + A_1 \cdot A_4 \cdot \overline{A_6})$$
The possible combinations available in Table 4 and corresponding to this function are:
$$\text{Redirected Recursion} = \sum_{A_0,\ldots,A_7}(15, 19, 47, 51)$$
Delegate in Family describes the case in which “related classes are defined to perform tasks collectively. In such cases multiple objects of related types can interact in generalized ways to delegate tasks to one another” [18]. It is characterized by the Parent and Dissimilar properties:
$$A_4 \cdot \overline{A_6} \cdot \overline{A_0}$$
The possible combinations available in Table 4 and corresponding to this function are:
$$\text{Delegate in Family} = \sum_{A_0,\ldots,A_7}(16, 48)$$
Redirect in Family “redirects some portion of a method implementation to a possible cluster of classes, of which the current class is a member” [18]. It is characterized by the Parent and Similar properties:
$$A_4 \cdot \overline{A_6} \cdot A_0 \cdot \overline{A_2}$$
The possible combinations available in Table 4 and corresponding to this function are:
$$\text{Redirect in Family} = \sum_{A_0,\ldots,A_7}(17, 19, 49, 51)$$
Delegate in Limited Family is used when “Delegate in Family is too generalized, and it is necessary to pre-select a sub-tree of the class hierarchy for polymorphism” [18]. It is characterized by the Sibling and Dissimilar properties:
$$\overline{A_3} \cdot \overline{A_4} \cdot A_5 \cdot \overline{A_0}$$
The possible combinations available in Table 4 and corresponding to this function are:
$$\text{Delegate in Limited Family} = \sum_{A_0,\ldots,A_7}(32)$$
Redirect in Limited Family is used when “Redirect in Family is too generalized, and it is necessary to pre-select a sub-tree of the class hierarchy for polymorphism” [18]. It is characterized by the Sibling and Similar properties:
$$\overline{A_3} \cdot \overline{A_4} \cdot A_5 \cdot A_0 \cdot \overline{A_2}$$
The possible combinations available in Table 4 and corresponding to this function are:
$$\text{Redirect in Limited Family} = \sum_{A_0,\ldots,A_7}(33, 35)$$
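The EDP functions of this section can be evaluated over the admissible combinations to reproduce the sums listed for each EDP. The sketch below is an illustration and not the authors' implementation. It assumes that $A_i$ carries the weight $2^i$, takes the 27 admissible combinations as data (the union of the sums in this section plus 81 and 113, which are discussed in Section 3.1), and writes the rules of Table 5 with negations read from the prose definitions in Section 2.2. The class name `EdpFunctions` and its members are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.IntPredicate;

final class EdpFunctions {
    // The 27 admissible combinations: all sums listed in Section 3 plus 81 and 113.
    static final int[] ADMISSIBLE = {0, 1, 3, 8, 9, 15, 16, 17, 19, 32, 33, 35, 40, 41, 47,
            48, 49, 51, 80, 81, 83, 112, 113, 115, 136, 137, 143};

    // True when Ai is verified in combination m; Ai is assumed to carry weight 2^i.
    static boolean a(int m, int i) {
        return ((m >> i) & 1) == 1;
    }

    // Class/object relationships of Table 5.
    static final IntPredicate OTHER_CLASS = m -> !a(m, 3) && !a(m, 4) && !a(m, 5);
    static final IntPredicate SAME_CLASS = m -> (a(m, 3) || a(m, 4)) && !a(m, 6) && !a(m, 7);
    static final IntPredicate SELF = m -> a(m, 3) || (a(m, 4) && !a(m, 6)) || a(m, 7);
    static final IntPredicate SUPER = m -> a(m, 6);
    static final IntPredicate PARENT = m -> a(m, 4) && !a(m, 6);
    static final IntPredicate SIBLING = m -> !a(m, 3) && !a(m, 4) && a(m, 5);

    // Method relationships of Table 5.
    static final IntPredicate DISSIMILAR = m -> !a(m, 0);
    static final IntPredicate SIMILAR = m -> a(m, 0) && !a(m, 2);
    static final IntPredicate DIFFERENT =
            m -> !a(m, 1) || (!a(m, 2) && !a(m, 4)) || !a(m, 3) || a(m, 6);
    static final IntPredicate SAME_METHOD =
            m -> a(m, 2) || (a(m, 1) && a(m, 3)) || (a(m, 1) && a(m, 7))
                    || (a(m, 1) && a(m, 4) && !a(m, 6));
    static final IntPredicate SAME_SIGNATURE = m -> a(m, 1);

    /** Each EDP as the product of one class/object rule and one method rule (Table 2). */
    static Map<String, IntPredicate> edps() {
        Map<String, IntPredicate> edps = new LinkedHashMap<>();
        edps.put("Delegate", OTHER_CLASS.and(DISSIMILAR));
        edps.put("Redirect", OTHER_CLASS.and(SIMILAR));
        edps.put("Conglomeration", SELF.and(DIFFERENT));
        edps.put("Recursion", SELF.and(SAME_METHOD));
        edps.put("Revert Method", SUPER.and(DISSIMILAR));
        edps.put("Extend Method", SUPER.and(SAME_SIGNATURE));
        edps.put("Delegated Conglomeration", SAME_CLASS.and(DISSIMILAR));
        edps.put("Redirected Recursion", SAME_CLASS.and(SAME_METHOD));
        edps.put("Delegate in Family", PARENT.and(DISSIMILAR));
        edps.put("Redirect in Family", PARENT.and(SIMILAR));
        edps.put("Delegate in Limited Family", SIBLING.and(DISSIMILAR));
        edps.put("Redirect in Limited Family", SIBLING.and(SIMILAR));
        return edps;
    }

    /** Admissible combinations on which the function f is true. */
    static List<Integer> minterms(IntPredicate f) {
        List<Integer> result = new ArrayList<>();
        for (int m : ADMISSIBLE) {
            if (f.test(m)) {
                result.add(m);
            }
        }
        return result;
    }

    /** Admissible combinations that no EDP function covers. */
    static List<Integer> unassigned() {
        List<Integer> result = new ArrayList<>();
        for (int m : ADMISSIBLE) {
            boolean covered = false;
            for (IntPredicate f : edps().values()) {
                covered |= f.test(m);
            }
            if (!covered) {
                result.add(m);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        edps().forEach((name, f) -> System.out.println(name + " = " + minterms(f)));
        System.out.println("Unassigned = " + unassigned());
    }
}
```

Under these assumptions every function yields exactly the combinations listed in the text, and the only admissible combinations left uncovered are 81 and 113. Some combinations (for example 19) satisfy more than one function, which reflects the ambiguity that static analysis cannot resolve in the presence of polymorphism.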
3.1. Adapt Method – A New EDP?
Two of the possible combinations in Table 4, 81 and 113, have not been associated with any EDP defined by the SPQR approach. A key question arises here: is the EDP catalog complete? The SPQR authors assert that the catalog defines all the “necessary, if not sufficient set of design patterns for object-oriented programming from which all other design patterns can be built and composed” [18]. Hence, they leave open the possibility of adding new EDPs to their catalog. We provide a first name, Adapt Method [12], and a first definition for this new EDP, which aims at adapting the interface of a method defined in a superclass and reusing its original implementation through a new, adapted interface. This EDP is similar to the Adapt Interface [14] or Narrow Interface [13] patterns. We consider it a particular case of these patterns in that it focuses on a single method.
3.2. Detecting EDPs in Java
EDPDetector4Java is the name of the prototype we have developed to test our approach. It uses the RECODER framework [17] to parse the source code and to extract and represent the information we need to detect EDPs. RECODER imposes three requirements on the analysis of a Java system: (1) the code should be compilable, meaning that it does not contain syntactical errors, (2) the system under analysis is a closed world, in that all references are known, and (3) RECODER supports only static analysis, in that it can read byte code but cannot execute it. These requirements introduce limitations in the current version of EDPDetector4Java. The information extracted from the source code is stored in an XML file, which is further managed through JAXB (Java Architecture for XML Binding) [11]. The configuration file of our prototype, which contains the definition of the eight method call properties, is also an XML file.
Our approach uses an Abstract Syntax Tree (AST) representation of the source code in conjunction with information related to type expressions, reference resolutions, and cross-references. To detect EDPs we have defined a hierarchy of Visitors [7]. Each EDP is identified through one Visitor [12]. The information related to the detection of an EDP is currently stored in an XML file, which is easier to interpret and whose content is easier to check for correctness. We mention two of the possible alternatives to this approach: (1) annotating the AST with information revealing the presence of an EDP, or (2) storing the information related to EDPs in a database for further queries that may be related to the recognition of design patterns, or to the extraction of metrics measuring code quality.
Using a Visitor for each EDP makes EDPDetector4Java an interactive and semi-automatic tool. The user may choose to detect only one EDP or a subset of EDPs and, furthermore, to specify which Java properties are to be considered in the recognition process. Hence, we consider that EDPDetector4Java may be very useful to provide a first abstraction of the source code that cannot be easily identified by a simple analysis of the code, even if the tool does not recognize design patterns yet. Moreover, a version of EDPDetector4Java has been developed as a plug-in for Eclipse.
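The one-Visitor-per-EDP organization, with the user selecting which EDPs to look for, might be sketched as follows. This is a deliberately simplified illustration and does not use or reproduce the RECODER API: instead of walking an AST, it visits call sites that are assumed to have been classified already into a combination index (with $A_i$ taken as weight $2^i$). All types below (`CallSite`, `EdpVisitor`, `CombinationVisitor`, `EdpScan`) are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for an analysed call: `combination` is the index of the
// verified properties, `location` a free-form label for reporting.
final class CallSite {
    final String location;
    final int combination;

    CallSite(String location, int combination) {
        this.location = location;
        this.combination = combination;
    }
}

// One visitor per EDP; this interface is illustrative and not part of RECODER.
interface EdpVisitor {
    String edpName();

    boolean matches(CallSite call);
}

// Recognizes an EDP by the list of combinations associated with it.
final class CombinationVisitor implements EdpVisitor {
    private final String name;
    private final List<Integer> combinations;

    CombinationVisitor(String name, Integer... combinations) {
        this.name = name;
        this.combinations = Arrays.asList(combinations);
    }

    @Override
    public String edpName() {
        return name;
    }

    @Override
    public boolean matches(CallSite call) {
        return combinations.contains(call.combination);
    }
}

final class EdpScan {
    // Applies only the visitors the user selected and reports one line per match.
    static List<String> run(List<CallSite> calls, List<EdpVisitor> selected) {
        List<String> report = new ArrayList<>();
        for (CallSite call : calls) {
            for (EdpVisitor visitor : selected) {
                if (visitor.matches(call)) {
                    report.add(visitor.edpName() + " at " + call.location);
                }
            }
        }
        return report;
    }
}
```

Selecting only a visitor built as `new CombinationVisitor("Extend Method", 83, 115)` restricts the report to Extend Method occurrences, which mirrors the semi-automatic usage described above.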
4. Conclusions and Current Work
This paper has presented a new approach for the definition and, hence, recognition of EDPs in Java. We consider the recognition of EDPs very useful because they are not only another way to represent source code information, but also capture design intents such as delegation of implementation and abstraction of interface. For this reason they have been introduced in courses on object-oriented design and implementation [19]. EDPs address the variants problem, too: variants mainly affect EDPs, which are simpler than design patterns. We cannot provide a comparison between EDPDetector4Java and SPQR because SPQR is currently not available for use in design pattern detection.
Initially, we focused on recognizing EDPs from Java source code following the SPQR approach exactly. Hence, we extracted the required information from the Java source code, expressed it in XML according to the Pattern/Object Markup Language (POML) specification [23] (we called this J2POML [28]), and fed the results to the OTTER theorem prover used by SPQR for design pattern detection [19]. Afterwards, we decided to develop our tool based on a new approach that exploits only Java language features. In our solution, the first four EDPs are easy to define and identify because they correspond directly to Java constructs. Eight method call properties have been identified to define the twelve EDPs belonging to the Method Invocation group. Our results for the EDPs of the Method Invocation group are summarized as follows: two EDPs are uniquely identified through a single combination of the eight method call properties (Delegate, and Delegate In Limited Family); five EDPs are identified through two combinations (Redirect, Revert Method, Extend Method, Delegate In Family, and Redirect in Family); three EDPs are identified through four combinations (Delegated Conglomeration, Redirected Recursion, and Redirect In Family); one EDP is identified through five combinations (Recursion); and one EDP is identified through twelve combinations (Conglomeration). Our analysis of EDPs and their definition in Java has led to the identification of a possible new EDP, called Adapt Method.
Current work follows two directions: (1) introducing dynamic analysis or, alternatively, identifying additional Java properties in order to uniquely identify each EDP through one combination of Java properties, and (2) identifying design patterns through combinations of EDPs. For the first objective we are studying the CAFFEINE [9] approach. For the second, we gave each design pattern as input to EDPDetector4Java to identify the EDPs it is composed of. The results showed that different design patterns are composed of the same set of EDPs; the difference lies in the way the EDPs are combined. Therefore, a solution to this problem may be the identification of new Java properties that we could exploit to precisely recognize design patterns through combinations of EDPs. Further, we implemented the EDP catalog in FUJABA (EDP4FUJABA) [3] to perform an in-depth comparison between the sub-patterns of FUJABA and the EDPs, a comparison which may lead to a unified catalog of design pattern sub-components.
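The observation that one EDP may be identified through one or several combinations of the eight properties can be pictured as a lookup from a set of observed properties to the EDPs that accept it. The sketch below is only an illustration under stated assumptions: the enum constants follow the eight properties listed in the abstract, while the names `EDP-A` and `EDP-B` and the combinations attached to them are hypothetical placeholders, not the definitions used by EDPDetector4Java.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.EnumSet;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

class EdpCombinationSketch {

    // The eight properties named in the abstract of the paper.
    enum Property {
        METHOD_NAME, METHOD_SIGNATURE, METHOD_DECLARATION,
        THIS_REFERENCE, SUPER_REFERENCE,
        SUPERCLASS, SAME_FAMILY, SAME_OBJECT
    }

    // EDP name -> accepted combinations; each combination is the set of
    // properties that hold for a given method call.
    static final Map<String, Set<Set<Property>>> CATALOG = new LinkedHashMap<>();

    static {
        // Placeholder entries for illustration only (assumption):
        // "EDP-A" is identified by a single combination, "EDP-B" by two.
        CATALOG.put("EDP-A", combos(
                EnumSet.of(Property.THIS_REFERENCE, Property.SAME_OBJECT)));
        CATALOG.put("EDP-B", combos(
                EnumSet.of(Property.SUPER_REFERENCE, Property.METHOD_NAME),
                EnumSet.of(Property.SUPERCLASS, Property.METHOD_NAME,
                           Property.METHOD_SIGNATURE)));
    }

    @SafeVarargs
    static Set<Set<Property>> combos(Set<Property>... accepted) {
        return new HashSet<>(Arrays.asList(accepted));
    }

    // Returns every EDP whose accepted combinations contain the observed one.
    static List<String> matches(Set<Property> observed) {
        List<String> names = new ArrayList<>();
        for (Map.Entry<String, Set<Set<Property>>> entry : CATALOG.entrySet()) {
            if (entry.getValue().contains(observed)) {
                names.add(entry.getKey());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        Set<Property> observed =
                EnumSet.of(Property.SUPER_REFERENCE, Property.METHOD_NAME);
        System.out.println(matches(observed)); // prints [EDP-B]
    }
}
```

In this representation an EDP accepted through several combinations simply has several entries in its set, so the number of entries per EDP is the quantity the summary above reports.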
Acknowledgements
We would like to thank Jason McC. Smith of the University of North Carolina, USA, for his useful hints and advice during this work.
5. References
[1] M. Abadi and L. Cardelli. A Theory of Objects. Springer-Verlag New York, Inc., 1996.
[2] F. Arcelli, S. Masiero, C. Raibulet, and F. Tisato. A Comparison of Reverse Engineering Tools based on Design Pattern Decomposition. In Proceedings of the Australian Software Engineering Conference, Brisbane, Australia, March 28th-31st, 2005, pp. 262-269.
[3] D. De Bortoli and L. Conti. Recognition of Elemental Design Patterns in FUJABA. BSc Thesis, University of Milano-Bicocca, Milan, Italy, April 2005.
[4] M. Ó Cinnéide. Automated Application of Design Patterns: A Refactoring Approach. PhD Dissertation, University of Dublin, Trinity College, 2001.
[5] A. H. Eden. Precise Specification of Design Patterns and Tool Support in Their Application. PhD Dissertation, Department of Computer Science, Tel Aviv University, 2000.
[6] G. Florijn, M. Meijers, and P. van Winsen. Tool Support for Object Oriented Patterns. In Proceedings of the 11th European Conference on Object-Oriented Programming, Springer-Verlag, Berlin, Germany, 1997.
[7] E. Gamma, R. Helm, R. Johnson, and J. Vlissides. Design Patterns: Elements of Reusable Object-Oriented Software. Addison Wesley, Reading MA, USA, 1994.
[8] J. Gosling, B. Joy, G. Steele, and G. Bracha. The Java™ Language Specification. Sun Microsystems, Inc., Second Edition, 2000.
[9] Y. G. Guéhéneuc, R. Douence, and N. Jussien. No Java without Caffeine: A Tool for Dynamic Analysis of Java Programs. In Proceedings of the 17th IEEE International Conference on Automated Software Engineering, September 2002, pp. 117-126.
[10] T. Lindholm and F. Yellin. The Java™ Virtual Machine Specification. Sun Microsystems, Inc., Second Edition, 1999.
[11] JAXB - http://java.sun.com/xml/jaxb/
[12] S. Masiero. Design Pattern Detection in Reverse Engineering: The Role of Sub-Patterns. Master Thesis, University of Milano-Bicocca, Milan, Italy, October 2004.
[13] Narrow The Interface - http://c2.com/cgi/wiki?NarrowTheInterface
[14] Kerievsky. Refactoring to Patterns. Addison Wesley, 2004.
[15] U. Nickel, J. Niere, and A. Zündorf. The FUJABA Environment. In Proceedings of the 22nd International Conference on Software Engineering, Limerick, Ireland, 2000, pp. 742-745.
[16] J. Niere, W. Schäfer, J. P. Wadsack, L. Wendehals, and J. Welsh. Towards Pattern-Based Design Recovery. In Proceedings of the 24th International Conference on Software Engineering, Orlando, Florida, USA, 2002, pp. 338-348.
[17] Recoder - http://recoder.sourceforge.net/
[18] J. McC. Smith. An Elemental Design Pattern Catalog. Technical Report TR02-040, University of North Carolina at Chapel Hill, USA, December 10th, 2002.
[19] J. McC. Smith and D. Stotts. SPQR: Flexible Automated Design Pattern Extraction From Source Code. In Proceedings of the 2003 IEEE International Conference on Automated Software Engineering, Montreal QC, Canada, October 2003, pp. 215-224.
[20] J. McC. Smith and D. Stotts. Elemental Design Patterns: A Link Between Architecture and Object Semantics. Technical Report TR02-011, University of North Carolina at Chapel Hill, USA, March 25th, 2002.
[21] J. McC. Smith and D. Stotts. Elemental Design Patterns: A Formal Semantics for Composition of OO Software Architecture. In Proceedings of the 27th Annual IEEE/NASA Software Engineering Laboratory Workshop, Greenbelt, MD, 2002, pp. 183-190.
[22] J. McC. Smith and D. Stotts. Elemental Design Patterns and the Rho-Calculus: Foundations for Automated Design Pattern Detection in SPQR. Technical Report 03-032, Computer Science Department, University of North Carolina at Chapel Hill, September 2003.
[23] J. McC. Smith. Pattern/Object Markup Language (POML): A Simple XML Schema for Object Oriented Code Description. University of North Carolina at Chapel Hill, April 2004.
[24] J. McC. Smith and D. Stotts. SPQR: Formalized Design Pattern Detection and Software Architecture Analysis. Technical Report 05-012, Computer Science Department, University of North Carolina at Chapel Hill, May 2005.
[25] J. F. Wakerly. Digital Design Principles and Practices. Prentice-Hall, Third Edition, 2000.
[26] JUnit - www.junit.org
[27] JEdit - www.jedit.org
[28] D. Bellinzona. J2POML: Extraction of Information for Design Pattern Recognition from Java Source Code. University of Milano-Bicocca, Milan, Italy, November 2004.
[29] J. Y. Gil and I. Maman. Micro Patterns in Java Code. In Proceedings of the 20th Annual ACM SIGPLAN Conference on Object Oriented Programming Systems Languages and Applications, 2005, pp. 97-116.
[30] H. Albin-Amiot, P. Cointe, Y. G. Guéhéneuc, and N. Jussien. Instantiating and Detecting Design Patterns: Putting Bits and Pieces Together. In Proceedings of the 16th International Conference on Automated Software Engineering, San Diego, CA, USA, 2001, pp. 166-173.
[31] D. Beyer and C. Lewerentz. CrocoPat: Efficient Pattern Analysis in Object-Oriented Programs. In Proceedings of the 11th IEEE International Workshop on Program Comprehension, Los Alamitos, CA, USA, 2003, pp. 294-295.
[32] I. Philippow, D. Streitferdt, M. Riebisch, and S. Naumann. An Approach for Reverse Engineering of Design Pattern. Software and Systems Modeling, Springer-Verlag, April 2004.
[33] R. K. Keller, R. Schauer, S. Robitaille, and P. Page. Pattern-Based Reverse-Engineering of Design Components. In Proceedings of the International Conference on Software Engineering, Los Angeles, CA, USA, 1999, pp. 226-235.