Detecting Patterns of Poor Design Solutions Using Constraint Propagation

Ghizlane El-Boussaidi and Hafedh Mili

Laboratoire de Recherches en Technologies du Commerce Électronique (LATECE), Faculté des Sciences, Université du Québec à Montréal, B.P. 8888, succursale Centre-Ville, Montréal (Québec) H3C 3P8, Canada
{el_boussaidi.ghizlane@courrier., hafedh.mili@}uqam.ca
Abstract. We propose an approach for applying design patterns that consists of recognizing occurrences of the modeling problem solved by the design pattern (problem pattern) in input models, which are then transformed according to the solution proposed by the design pattern (solution pattern). In this paper, we look at the issue of identifying instances of problem patterns in input models, and marking the appropriate entities so that the appropriate transformations can be applied. Model marking within the context of MDA is a notoriously difficult problem, in part because of the structural complexity of the patterns that we look for, and in part because of the required design knowledge and expertise. Our representation of design problem patterns makes it relatively easy to express the pattern matching problem as a constraint satisfaction problem. In this paper, we present our representation of design problem patterns, show how matching such patterns can be expressed as a constraint satisfaction problem, and present an implementation using ILOG JSolver, a commercial CSP solver.

Keywords: marking models, constraint satisfaction problems, transformations, design patterns.
1 Introduction

Earlier transformational approaches described software development as a sequence of property-preserving transformations that are applied to a set of user requirements to produce functional software that satisfies a number of quality requirements [14]. Design choices made during the development process lead to different transformation chains. This idea was recently revived by the OMG's Model Driven Architecture (MDA) [11]. The basic idea behind MDA is to separate the specification of a system from its implementation on a specific platform. MDA proposes to specify a system as a platform-independent model (PIM), to specify platform models (models describing platforms), and to transform the PIM into a platform-specific model (PSM) according to the chosen platform. Roughly speaking, the transformation from a PIM to a PSM is guided by a specification that provides the mapping between their entities. Generally these mappings are specified using a combination of model type mapping, defined at the meta-model level, and model instance mapping, which uses a marking process where, depending on the target platform, the architect marks entities of the PIM. In practice, the transformations are not fully automated because the mapping between the source and target models is incomplete or non-deterministic [13], and the marking process remains essentially manual. Consequently, a major challenge in MDA is 1) to specify these mappings precisely enough to encode them in systematic procedures, and 2) to support an automatic or semi-automatic marking process to guide these procedures when applied to other requirements.

In this paper, we propose an approach for automatically marking models, specifically in the context of applying design patterns. This approach consists of recognizing in input models instances of the design problem solved by a design pattern (problem pattern), which are then transformed according to the solution proposed by the design pattern (solution pattern). We look at the issue of identifying instances of problem patterns in input models, and marking the appropriate entities so that the appropriate transformations can be applied. Our representation of problem patterns makes it possible to express the pattern matching problem as a constraint satisfaction problem.

The paper is structured as follows. Section 2 gives an overview of our approach to representing and applying design patterns. We describe the marking process in section 3. In sections 4 and 5 we show how the pattern matching problem is expressed as a constraint satisfaction problem. The implementation is presented in section 6, followed by a discussion in section 7 and the conclusion.
2 Overview of the Approach

Design patterns codify proven solutions to recurrent design problems. Design problems come in many shapes and forms, but a significant number of such problems can be expressed as poor solutions to modeling requirements. We represent such design patterns by triples (MP, MS, T) (see Fig. 1), where MP is a model describing the design problem solved by the pattern, MS is a model describing the solution proposed by the pattern, and T is a transformation that transforms an input model exhibiting an instance of MP by replacing that instance with the corresponding solution. Both MP and MS are metamodels to the extent that their instances are models. We envision a catalog of such triples in a modeling workbench. Given an input model, a designer can try to detect potential instances of the problems solved by the patterns of the catalog in this model, i.e. by matching the MP components of the patterns against the input model. A successful match marks the entities of the input model with the roles they play in the model of the problem (MP), and suggests that the design pattern may be applicable to this model. The designer may then apply the transformation rules to the marked model. The outcome is the input model in which an instance of the design problem addressed by the design pattern has been replaced by the solution proposed by the pattern. In this paper we describe the matching and marking process.
[Figure: two phases. Representing patterns: problem (meta)models (MP), transformation rules (T), and solution (meta)models (MS). Applying patterns: input model → matching and marking → marked model (problem instances) → applying transformation → transformed model (solution instances).]
Fig. 1. Overview of the approach
Consider the example of a class hierarchy that represents a taxonomy of classes that implement the same behavior (the same set of methods). A proverbial example of such a hierarchy is the class hierarchy representing the node types of an abstract syntax tree for some language X. Assume that the set of node types is stable (new constructs are seldom added to language X), but that new methods are regularly added to manipulate our abstract syntax trees in different ways. With the traditional organization, each time a new behavior is added (e.g. a method to pretty-print an expression in the language) we potentially have to change every class of the hierarchy by adding a node-specific implementation of that operation. The visitor design pattern proposes to make these changes more localized by grouping all of the implementations of a given operation in a single visitor object [7].

We now show how we represent the design problem solved by the visitor using our approach. Figure 2 shows a model of class hierarchies that implement the same behavior. The root of the hierarchy is represented by the metaclass AbstractClass, which implements a number of AbstractOperation’s (the ‘has_message’ link). This class has a number of concrete subclasses (ConcreteClass) which implement all of its AbstractOperation’s, and possibly more, with ConcreteOperation’s. To reflect the fact that every ConcreteClass has one ConcreteOperation (has_method) for each AbstractOperation supported by (has_message) its abstract superclass (AbstractClass), we use the homomorphism relation. To keep the model simple, we do not show the parameters and return types of the operations. We call this model the static problem model. This, alone, does not represent typical situations where the visitor pattern applies; in fact, it represents a properly factored class hierarchy!

What is missing is the kind of future evolution of the class hierarchy that the visitor pattern protects against, namely growth in the number of methods defined by AbstractClass and implemented by its ConcreteClass’es. In fact, most of the GoF patterns can be understood in terms of making models more resilient to certain kinds of changes, and it is important to capture those changes. To this end, we extended our UML-based representation of problem models to represent time evolution. In particular, we have been able to represent all of the changes we came across in terms of changing (meta-level)
association cardinalities. For example, if an abstract class can specify a changing (increasing) number of operations, the “has_message” association between AbstractClass and AbstractOperation will have a cardinality of 1 ↔ 0..*,++, where we added the “++” qualifier to indicate that the cardinality will increase over time. Figure 3 shows the new problem model, which includes time evolution. An evolving association end is called a time hotspot. For further details about our language for modeling design problems, see [12].

[Figure: metaclasses AbstractClass, AbstractOperation, ConcreteClass, and ConcreteOperation, connected by the associations Has_message, Has_method, Implements, Overrides, and Inherits_from (roles super_class and sub_class), with a {Homomorphism with} relation between Has_message and Has_method.]
Fig. 2. The static problem model of the Visitor pattern
[Figure: the same metaclasses and associations as in Fig. 2 (AbstractClass, AbstractOperation, ConcreteClass, ConcreteOperation; Has_message, Has_method, Implements, Overrides, Inherits_from, {Homomorphism with}), with the evolving 0..* association ends qualified by "++".]
Fig. 3. The (complete) problem model = the static problem model + time hotspots
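The "++" qualifier can be pictured as one extra flag on an association end. The following is only an illustrative sketch in Python, not the representation used in [12]; the names AssociationEnd, parse_end, and grows are hypothetical, and only cardinalities of the forms "1", "1..*", and "0..*,++" are handled.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class AssociationEnd:
    """One end of a (meta-level) association. Hypothetical structure."""
    role: str
    lower: int
    upper: Optional[int]  # None stands for "*"
    grows: bool = False   # True when the cardinality carries the "++" qualifier

    @property
    def is_time_hotspot(self) -> bool:
        # An evolving association end is what the text calls a time hotspot.
        return self.grows


def parse_end(role: str, text: str) -> AssociationEnd:
    """Parse cardinalities such as '1', '1..*', '0..* ++' or '0..*,++'."""
    text = text.strip()
    grows = text.endswith("++")
    if grows:
        text = text[:-2].rstrip(" ,")
    lower, _, upper = text.partition("..")
    if not upper:  # a single number such as '1'
        upper = lower
    return AssociationEnd(role, int(lower), None if upper == "*" else int(upper), grows)


# The 'message' end of has_message in the complete problem model.
message_end = parse_end("message", "0..*,++")
```

Keeping the flag on the association end, rather than on the association as a whole, mirrors the statement that it is the evolving end that is called a time hotspot.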
3 The Marking Process

The marking process is the process of recognizing instances of problem models in a given input model. As the previous example shows, a problem model consists of a static component (Figure 2), which can be verified statically on input models, and evolution information (time hotspots), which is not contained in input models and needs to be added to them. Accordingly, we break the problem model matching into three steps (Fig. 4).
[Figure: the designer and the problem models feed three successive steps applied to the input model: (1) pattern matching using CSPs, which yields configurations compliant with static problem models; (2) evolution data input, which yields problem model instances; (3) design-pattern-specific marking, which yields the marked model with marked problem model instances.]
Fig. 4. The marking process
1. Static problem model matching: In this step we try to recognize, in a given input model, structures that comply with some static problem models. This is basically a graph pattern matching problem, since models are typed graphs. To this end, we use constraint satisfaction techniques, explained in sections 4 and 5.

2. Evolution data input: A static problem model match does not mean that the corresponding design pattern applies. In fact, a model such as the one in Figure 2 characterizes several design patterns other than visitor (e.g. strategy and bridge). Evolution data can be input in one of two ways. If we have different versions of the system at hand, we can compare the class models of the different versions to see which parts have evolved between versions (see e.g. [16]). Alternatively, for new systems or for systems for which historical data is not available, we can prompt designers to identify likely variation points (e.g., the number of subclasses of a given class, the number of methods, etc.). Naturally, the accuracy of such variation points depends on the level of expertise of the designer and her/his experience with the problem domain.

3. Full problem model marking: In this step, the variation points (the time hotspots) complement the static problem match to ascertain the compliance of a fragment from the input model with the full problem model of a design pattern. Once such a fragment is found, we mark its elements (classes, methods, attributes, associations, inheritance relationships, etc.) with the roles they play in the pattern problem model. For the visitor problem model, a class will be tagged as AbstractClass, its subclasses tagged as ConcreteClass’s, its abstract methods tagged as AbstractOperation, etc. Because different problem models may refer to the same roles (e.g. the notion of AbstractClass appears in many problem models), we need to qualify the roles by the pattern to which they refer.
Thus, the same abstract class in the input model might be marked as Visitor.AbstractClass and Strategy.AbstractClass. Because the same input model may include different instances of the same problem model, the marks are further qualified by pattern instance number. Thus, a concrete class might be tagged as Strategy.1.ConcreteClass, Strategy.2.ConcreteClass, and Visitor.1.ConcreteClass. These tags are used by our transformation rules to make sure that different instances of design patterns do not interfere with each other.
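The qualified tags described above can be sketched as follows. This is a minimal illustration in Python, not the authors' implementation: the MarkedModel class and its methods are hypothetical, and model elements are represented simply by their names.

```python
from collections import defaultdict


def make_mark(pattern: str, instance: int, role: str) -> str:
    """Build a tag of the form Pattern.InstanceNumber.Role."""
    return f"{pattern}.{instance}.{role}"


class MarkedModel:
    """Keeps, for each model element, the set of qualified roles it plays."""

    def __init__(self):
        self._marks = defaultdict(set)

    def mark(self, element: str, pattern: str, instance: int, role: str) -> None:
        self._marks[element].add(make_mark(pattern, instance, role))

    def marks_of(self, element: str) -> list:
        return sorted(self._marks.get(element, ()))

    def elements_of(self, pattern: str, instance: int) -> dict:
        """Elements taking part in one pattern instance, with their roles."""
        prefix = f"{pattern}.{instance}."
        result = defaultdict(set)
        for element, marks in self._marks.items():
            for m in marks:
                if m.startswith(prefix):
                    result[element].add(m[len(prefix):])
        return dict(result)


model = MarkedModel()
model.mark("Expression", "Strategy", 1, "ConcreteClass")
model.mark("Expression", "Strategy", 2, "ConcreteClass")
model.mark("Expression", "Visitor", 1, "ConcreteClass")
model.mark("Node", "Visitor", 1, "AbstractClass")
```

Selecting elements by the pair (pattern, instance number) is what would let a transformation rule touch only the elements of one problem instance at a time.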
4 Using CSPs for Pattern Matching

Models are graphs, and recognizing a problem model in a given input model can be considered a graph pattern matching problem. The latter is also known as the subgraph homomorphism problem, and was shown to be NP-complete [15]. Consider the graphs L=(LV, LE) and G=(GV, GE) shown in Fig. 5, where LV and GV represent the sets of vertices of L and G, respectively, and LE and GE represent their respective sets of edges. A homomorphism from L to G is a mapping M from LV to GV such that for each edge (A, B) in LE, (M(A), M(B)) is an edge in GE.

[Figure: graph L with vertices A, B, and C, and graph G containing their images M(A), M(B), and M(C) under the mapping M.]
Fig. 5. A simple homomorphism example
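The definition of a homomorphism translates directly into a check over the edges of L. The sketch below, in Python, uses two small hypothetical graphs (they are not the graphs of Fig. 5) and enumerates all candidate mappings by brute force, which is only practical for very small graphs.

```python
from itertools import product


def is_homomorphism(mapping, l_edges, g_edges):
    """True if every edge (A, B) of L is mapped to an edge (M(A), M(B)) of G."""
    g = set(g_edges)
    return all((mapping[a], mapping[b]) in g for a, b in l_edges)


def all_homomorphisms(l_vertices, l_edges, g_vertices, g_edges):
    """Enumerate every mapping from LV to GV and keep the homomorphisms."""
    found = []
    for image in product(g_vertices, repeat=len(l_vertices)):
        mapping = dict(zip(l_vertices, image))
        if is_homomorphism(mapping, l_edges, g_edges):
            found.append(mapping)
    return found


# Hypothetical example: L is a path A -> B -> C; G has a path and a self-loop.
L_VERTICES = ["A", "B", "C"]
L_EDGES = [("A", "B"), ("B", "C")]
G_VERTICES = [1, 2, 3]
G_EDGES = [(1, 2), (2, 3), (3, 3)]

homomorphisms = all_homomorphisms(L_VERTICES, L_EDGES, G_VERTICES, G_EDGES)
```

Note that a homomorphism need not be injective: in this example, the mapping that sends A, B, and C all to vertex 3 is valid because of the self-loop (3, 3).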
The graph homomorphism problem can be reformulated and solved as a constraint satisfaction problem [15]. First, we summarize the basic concepts of CSP theory and then introduce an approach to solve the graph matching problem using CSPs.

4.1 CSP: Basic Concepts

A constraint satisfaction problem (CSP) is a problem that can be framed as finding values for a set of variables, within particular domains, under some constraints. Precisely, a CSP consists of [15] [3]:
- A finite set of variables V={v1,…,vn},
- Their corresponding domains (ranges of values); let Di be the domain of the variable vi,
- A set R of constraints, which is a finite set of restrictions on the variables of V.

A constraint C ∈ R defined over a tuple of variables (v1,..., vr) specifies how these variables can be assigned values from their domains. The constraint C can be seen as a relation over the cartesian product of these variables' domains, i.e. C(v1,..., vr) ⊆ D1×...×Dr. The size of this tuple is known as the arity of the constraint C. A constraint C on a tuple of variables (v1,..., vr) is satisfied by a tuple of values (d1,..., dr) if and only if
1. ∀(1≤i≤r) di ∈ Di (the domain of vi), and
2. (d1,..., dr) ∈ C(v1,..., vr).

Much work in the literature has focused on binary constraints, since it has been proven that any n-ary discrete CSP can be translated into an equivalent binary CSP [3]. A solution to a CSP, when one exists, consists of a tuple of values (d1,..., dn) where each value di is assigned to the corresponding variable vi such that all the constraints in R are satisfied.

To illustrate these concepts, let us consider the well-known N-queens problem, where we have to place N queens on an N by N chessboard such that no two queens are placed in the same row, in the same column, or on the same diagonal. One formalization of the N-queens problem as a CSP uses variables to represent the rows of the chessboard [17], where each variable can take values from 1 to N corresponding to the column position of the queen placed in that row. Hence: Dv1 = Dv2 =…= DvN = {1,2,…,N}. To ensure that no two queens are in the same column and no two queens are on the same diagonal, we use two constraints:
∀i ≠ j: vi ≠ vj, and
∀i ≠ j: if vi = a and vj = b then |i − j| ≠ |a − b|.

CSPs are generally solved using exhaustive search with backtracking. When a variable is assigned a value, all the constraints involving this variable are examined. This is known as constraint propagation, since the constraint reduces the domains of the related variables. In the N-queens problem, if we determine the column position of the queen on row i to be j, that removes j from the domains of all the other variables. During constraint propagation, if a variable's domain becomes empty, the algorithm backtracks to the last choice point where a value was assigned, and tries another untried value. If there are no other values to consider, the algorithm backtracks to the preceding choice point, and so on. If there are no more choice points and a variable's domain is empty, the problem has no solution. When a variable's domain is reduced to one value, the variable is said to be bound [4].

4.2 The Graph Pattern Matching Problem as a CSP

Rudolf proposed to translate graph matching problems into equivalent CSPs [15].
Using this approach for untyped graphs, each node and each arc of the pattern is mapped to a distinct variable: for each node variable, the domain is the set of nodes in the target graph, and for each edge variable, the domain is the set of edges in the target graph. The restrictions of a graph homomorphism are expressed using constraints. Consider the following example from [15], where we look for occurrences of the graph L in the graph G (Fig. 6). In this example, we have three variables x1, x2, x3 whose domains are D1 = D3 = {d1, d3, d4, d6} and D2 = {d2, d5}, where Di represents the domain of the variable xi. The homomorphism properties are embodied in two constraints that state that when x1, x2 and x3 are bound, the value assigned to x2 (which is an edge) has a source equal to the value assigned to x1 and a target equal to the value assigned to x3:
Csrc(x2,x1) = { (de,dv) ∈ D2 × D1 | source(de) = dv } = {(d2, d1), (d5, d6)} ¹
Ctar(x2,x3) = { (de,dv) ∈ D2 × D3 | target(de) = dv } = {(d2, d3), (d5, d6)}
¹ e stands for edge and v stands for vertex; source and target are total mappings from the set of edges to the set of vertices.
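These constraints reduce the domains of the variables, leading to two solutions, (d1,d2,d3) and (d6,d5,d6).

[Figure: pattern graph L, consisting of an edge x2 from vertex x1 to vertex x3, and target graph G, with vertices d1, d3, d4, d6 and edges d2 and d5.]
Fig. 6. A simple example of graph pattern matching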
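The example can be reproduced with a few lines of Python. The sketch below is not Rudolf's algorithm or a real solver: it is a minimal backtracking search with a simple propagation step, and the SOURCE and TARGET tables are read off the two constraint relations given in the text.

```python
# Edges of the target graph G with their end points, as listed in Csrc and Ctar.
SOURCE = {"d2": "d1", "d5": "d6"}
TARGET = {"d2": "d3", "d5": "d6"}

DOMAINS = {
    "x1": {"d1", "d3", "d4", "d6"},  # vertex variable
    "x2": {"d2", "d5"},              # edge variable
    "x3": {"d1", "d3", "d4", "d6"},  # vertex variable
}

# Each binary constraint: (edge variable, vertex variable, end-point mapping).
CONSTRAINTS = [("x2", "x1", SOURCE), ("x2", "x3", TARGET)]


def propagate(domains):
    """Drop unsupported values until nothing changes; False if a domain empties."""
    changed = True
    while changed:
        changed = False
        for edge_var, vertex_var, end in CONSTRAINTS:
            keep_v = {v for v in domains[vertex_var]
                      if any(end[e] == v for e in domains[edge_var])}
            keep_e = {e for e in domains[edge_var] if end[e] in keep_v}
            if keep_v != domains[vertex_var] or keep_e != domains[edge_var]:
                domains[vertex_var], domains[edge_var] = keep_v, keep_e
                changed = True
    return all(domains.values())


def solve(domains):
    """Return all solutions as (x1, x2, x3) tuples."""
    domains = {var: set(values) for var, values in domains.items()}
    if not propagate(domains):
        return []  # dead end: backtrack
    unbound = [var for var in sorted(domains) if len(domains[var]) > 1]
    if not unbound:
        return [tuple(next(iter(domains[v])) for v in ("x1", "x2", "x3"))]
    var = unbound[0]  # choice point
    solutions = []
    for value in sorted(domains[var]):
        trial = dict(domains)
        trial[var] = {value}
        solutions.extend(solve(trial))
    return solutions


solutions = solve(DOMAINS)
```

The first propagation pass already shrinks D1 to {d1, d6} and D3 to {d3, d6}; binding x1 then forces the other two variables, which yields the two solutions stated above.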
5 Building CSPs from Problem Models

Our goal is to apply Rudolf's approach to find instances of a problem model in a given input model. Thus we need to translate our pattern matching problem into a CSP representation. In practice, this means generating a set of variables and a set of constraints for a given problem model, and computing the domains of the variables for a given input model. The process of generating CSPs from problem models and input models is illustrated in Fig. 7.
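[Figure: the CSP generator takes a problem model and an input model. A variables extractor and a constraints extractor derive the set of variables and the set of constraints from the problem model; a domains extractor derives the sets of domains from the input model. The resulting CSP problems are passed to a CSP solver, which produces solutions*.]
*A solution is an instance of a problem model in an input model.
Fig. 7. The process of generating CSPs from problem models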
5.1 Extracting Variables

For a given design pattern problem model, the variables consist of all the elements of the model, namely its classes, operations, attributes, and all of the (meta-level) relationships between them (association, inheritance, etc.). For example, the problem model of the Visitor pattern (Fig. 3) yields the following variables, with self-explanatory names: var_AbstractClass, var_ConcreteClass, var_AbstractOperation, var_ConcreteOperation, var_inherits_from_1, var_inherits_from_2, var_has_method, var_has_message, var_implements, and var_overrides. The first four variables correspond to the entities of the problem model (its classes), and the last six represent associations between those entities. Note that we use two inherits_from variables to distinguish the inherits_from relationship between ConcreteClass’es and AbstractClass from the inherits_from relationship between ConcreteClass’es.

5.2 Extracting Domains

Following our visitor example, given an input object model, we need to identify the domains for the ten variables mentioned above. Domain extraction raises an issue that is common to all CSPs, and has to do with the level of precision of domains and the effort to expend on domain extraction. Consider the variable var_AbstractClass. We could define the domain for this variable to be the set of all the abstract classes in the input model, but then we have to develop an automated procedure to identify those in the input model. Alternatively, we could define the domain as all the classes of the input model, and add to the set of constraints a unary constraint that identifies abstract classes among those. In fact, we could even use the entire input model as a domain, and have the unary constraint test for the “classness” and “abstractness” of the domain values. There is an advantage to having precise domains: the CSP solution process is more efficient, since there are fewer values to try. The disadvantage, from a representation point of view, is that some essential problem knowledge is buried in non-explicit, non-declarative procedures (domain extractors), as opposed to being expressed explicitly and declaratively in constraints.

Dealing with UML models is a mixed blessing: both “classness” and “abstractness” are represented explicitly², and thus computing the domains for var_AbstractClass and var_ConcreteClass is straightforward. Things are slightly more complicated with var_AbstractOperation and var_ConcreteOperation. UML does provide the Operation class, but the “abstraction” of operations is a programming concept. Thus, we can either use all of the operations of the input model as domains for var_AbstractOperation and var_ConcreteOperation and add constraints to identify abstractness/concreteness, or enrich the domain extraction procedure to make that distinction. Things get a bit more complicated with the variables var_inherits_from_1, var_inherits_from_2, var_has_method, and var_has_message: these variables represent associations that are embodied (hardcoded) in the UML metamodel, as opposed to being represented explicitly by entities. Indeed, in the UML metamodel, the association between a class and its methods is implicit in the definition of the class: it is not represented by an explicit entity of the UML metamodel. The same holds for the inheritance relationship: it is an association in the UML metamodel itself (between classes), and not an entity³. Things get more complicated still with the variable var_overrides and the underlying “overrides” relationship between operations. This has no representation in UML metamodels in any shape or form (entity, attribute, association), and has to be computed from the ground up⁴.

² In UML, the “abstractness” of a Class is represented by a boolean attribute (isAbstract) inherited from Classifier.
³ Such associations are usually represented as attributes of the UML metamodel classes. For example, Class has an attribute superClass, which can take a list of Class’es.
⁴ For example, by identifying operations that have the same signature, and leaving the inheritance aspect as a constraint.
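The trade-off between precise domains and broad domains with unary constraints can be made concrete with a small sketch. The Python code below uses a toy in-memory model, not EMF or the UML metamodel; ModelClass, ModelOperation, and all function names are hypothetical, and an operation's signature is reduced to its name for brevity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelClass:
    name: str
    is_abstract: bool = False  # plays the role of UML's isAbstract attribute


@dataclass(frozen=True)
class ModelOperation:
    owner: str  # name of the owning class
    name: str   # stands in for the full signature


input_model = [
    ModelClass("Node", is_abstract=True),
    ModelClass("Expression"),
    ModelOperation("Node", "PrettyPrint"),
    ModelOperation("Expression", "PrettyPrint"),
    ModelOperation("Expression", "Evaluate"),
]


def extract_abstract_classes(model):
    """Option 1: a precise domain, computed procedurally by the extractor."""
    return {e for e in model if isinstance(e, ModelClass) and e.is_abstract}


def is_abstract_class(element):
    """Option 2: a declarative unary constraint applied to a broad domain."""
    return isinstance(element, ModelClass) and element.is_abstract


def apply_unary_constraint(domain, constraint):
    return {value for value in domain if constraint(value)}


def override_candidates(model):
    """Pairs of operations with the same signature in different classes.
    The inheritance aspect is left to a separate constraint (see footnote 4)."""
    ops = [e for e in model if isinstance(e, ModelOperation)]
    return {(a, b) for a in ops for b in ops
            if a.owner != b.owner and a.name == b.name}


precise_domain = extract_abstract_classes(input_model)
broad_domain = set(input_model)  # the entire input model as a domain
filtered_domain = apply_unary_constraint(broad_domain, is_abstract_class)
```

Both options end up with the same values for var_AbstractClass; they differ in where the knowledge lives (in the extractor or in a constraint) and in how many values the solver has to consider initially.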
5.3 Extracting Constraints

Constraints represent relationships between the problem variables: they are what ties the problem model together. A number of such constraints are used to bind the various association ends to the associated entities. Figure 8 shows excerpts from the visitor problem model, and how the various components map to CSP components. As mentioned before, AbstractClass, ConcreteClass, and inherits_from are mapped to three variables var_AbstractClass, var_ConcreteClass, and var_inherits_from_1. The domains for these variables are all abstract classes, all concrete classes, and all inherits-from relationships in the input model, respectively. To match an “A inherits_from B” fragment in the input model, where A is an abstract class and B is a concrete class, we need two constraints that we call cons_inherits_1_src and cons_inherits_1_tar, defined as follows:
cons_inherits_1_src(var_AbstractClass, var_inherits_from_1) ≡ var_inherits_from_1.source == var_AbstractClass
cons_inherits_1_tar(var_inherits_from_1, var_ConcreteClass) ≡ var_inherits_from_1.target == var_ConcreteClass
Similar pairs of constraints are used to link the association ends for has_method, has_message, implements, and overrides.
Fig. 8. An illustration of the mapping from problem model to variables and constraints
Finally, we define a constraint to represent the homomorphism between the has_message association (between AbstractClass’es and AbstractOperation’s), on one hand, and the has_method association (between ConcreteClass’es and ConcreteOperation’s), on the other.
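The source and target constraints and the homomorphism constraint can be sketched as plain predicates. The Python code below is an illustration only: the Inherits record, the toy domains, and the brute-force enumeration are hypothetical, the source and target ends follow the constraint definitions given above, and operations are matched by name as a simplifying assumption.

```python
from collections import namedtuple

# An inherits_from relationship of the input model, with the end names used
# in the constraint definitions (source and target).
Inherits = namedtuple("Inherits", "source target")


def cons_inherits_1_src(abstract_class, inherits):
    return inherits.source == abstract_class


def cons_inherits_1_tar(inherits, concrete_class):
    return inherits.target == concrete_class


def cons_homomorphism(abstract_class, concrete_class, has_message, has_method):
    """Every operation declared by the abstract class (has_message) must have
    a counterpart among the methods of the concrete class (has_method)."""
    return has_message.get(abstract_class, set()) <= has_method.get(concrete_class, set())


def matches(abstract_dom, inherits_dom, concrete_dom, has_message, has_method):
    """Enumerate the (AbstractClass, inherits_from, ConcreteClass) triples
    that satisfy all three constraints."""
    return [
        (a, i, c)
        for a in abstract_dom
        for i in inherits_dom
        for c in concrete_dom
        if cons_inherits_1_src(a, i)
        and cons_inherits_1_tar(i, c)
        and cons_homomorphism(a, c, has_message, has_method)
    ]


# Toy domains: VariableRefNode lacks PrettyPrint, so it fails the homomorphism.
abstract_dom = ["Node"]
concrete_dom = ["Expression", "VariableRefNode"]
inherits_dom = [Inherits("Node", "Expression"), Inherits("Node", "VariableRefNode")]
has_message = {"Node": {"GenerateCode", "PrettyPrint"}}
has_method = {
    "Expression": {"GenerateCode", "PrettyPrint", "Evaluate"},
    "VariableRefNode": {"GenerateCode"},
}

found = matches(abstract_dom, inherits_dom, concrete_dom, has_message, has_method)
```

The subset test allows a concrete class to define more operations than its abstract superclass declares, in line with the "and possibly more" remark in section 2.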
6 Implementation
6.1 The Pattern Meta-model and the Transformation Rules

We implemented our language for representing patterns as a meta-model [12]. To do so, we used the Eclipse Modeling Framework™ (EMF), which includes a package called ECore that implements a lightweight version of MOF. Our meta-model was implemented as an extension to ECore. We also implemented our transformation rules using ILOG JRules. JRules is a hybrid object-rule system where the condition and action parts of rules refer to Java objects and methods. In our case, conditions and actions refer to elements of problem and solution models represented as instances of the ECore meta-model. The transformation rules are described in detail in [6].

6.2 The Pattern Matching Strategy with a CSP Solver

A major decision during the implementation of a CSP pattern matching strategy was the choice of a solver. Having variables whose domains consist of objects, as opposed to scalar values, was a determining factor. We chose the ILOG JSolver™ toolset, which is a constraint solver written in Java and designed to work against Java object models. Another significant advantage of JSolver in our context is its support for user-defined constraints coded in Java. This gives us the full power of the Java language to define new constraints. Indeed, we implemented our constraints as extensions to a generic user constraint class provided by JSolver. In practice, implementing each constraint meant implementing the domain reductions of its variables. To do so, JSolver provides domain modifiers (e.g. removeDomainValue) that we used in the propagation method of a constraint to eliminate the values that cannot be part of a solution. When the domain of a variable is modified, the examination of the constraints on this variable is triggered [4].

We implemented the variables, the constraints, and the domain extractors in Java. The variables extractor module reads a problem model and maps its entities and their relationships into variables, as explained in section 5.1. The domain extractor module provides the necessary methods to extract the various domains from input models. As mentioned in section 5.2, we took advantage of the UML metamodel, as supported by EMF’s ECore, to extract domains for the most common entity types (abstract classes, concrete classes, interfaces, etc.). The constraint extractor reads a problem model, identifies its associations, and generates the corresponding source and target constraints, as explained in section 5.3. The process flow is as follows: the CSP generator (Fig. 7) instantiates the variables extractor, the constraints extractor, and the domains extractor. Then it builds the CSP by retrieving the variables from the variables extractor, retrieving the corresponding constraints from the constraints extractor, and setting the variables' domains by applying the domain extractor to the input model. In practice, we need to implement a CSP generator for each problem model, even though the different generators share a lot of functionality. So far, we have coded CSP generators for four design patterns: the visitor, bridge, composite, and strategy.

6.3 Post-processing JSolver Output

Notwithstanding issues related to time hotspots (see section 3), the output produced by the solver does not identify instances of design pattern problems as we would expect them. Given the model fragment shown in Figure 9 (the proverbial example for the visitor pattern), and our codification of the problem model, JSolver will return four separate solutions delimited by the various dashed lines. This happens for two reasons. The first is a common problem in pattern matching, which we will illustrate with a simple example. Assume that we want to match the regular expression a*b* in some text. If the target string is cdfiaaabbxyz, then the expression a*b* will match, among other substrings, ab, aab, aabb, aaab, and aaabb. A regular expression tool needs to be told, or coded in such a way, that it returns the longest matching sequence. JSolver returns all of the fragments that satisfy the set of constraints, and in those cases where we have recursive associations, it will generate paths with different lengths. This explains why it identified both Expression → Node and ArithmeticExpression → Expression → Node as solutions. Thus, we need to collect and collate such solutions to identify the largest model fragment that fits the pattern.
[Figure: class hierarchy rooted at Node, with subclasses Expression, VariableRefNode, and AssignmentNode, and with ArithmeticExpression as a subclass of Expression. Each class defines GenerateCode(), PrettyPrint(), ... Dashed lines delimit the fragments returned by JSolver.]
Fig. 9. JSolver produces problem model instance fragments which need to be collated into a single problem instance.
The second problem is a limitation of the version of JSolver™ that we used, which explains why we get Expression → Node, VariableRefNode → Node, and AssignmentNode → Node as separate solutions, as opposed to paths within the same solution. Indeed, to get all of the subclasses of a given class in one fell swoop, we need to be able to define variables whose domains are sets of objects, as opposed to individual objects. The version of JSolver that we used supports only arrays of numbers, and requires us to specify the size of the array as part of the definition of the variable, which is not practical, since we do not know the size beforehand. We continue to explore ways around this limitation. For the time being, for a given problem model, we collect the solutions produced by JSolver and organize them in a “forest” based on common subpaths; once we are done, each tree in the forest represents a different problem instance. We applied our approach to sample input models, and were able to detect instances of these patterns where appropriate. We are currently working on implementing a graphical user interface that enables us to highlight instances of the various design patterns within input models.
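The collation step can be sketched as a merge of paths into nested dictionaries. This Python sketch is an assumption about how such a forest could be built, not the authors' code: each solution is represented as a tuple of class names running from a subclass up to the root of the hierarchy, and the collate function is hypothetical.

```python
def collate(paths):
    """Merge subclass-to-root paths into a forest.

    The result maps each root to a nested dict of its subclasses, so paths
    that share a suffix (a common subpath towards the root) end up in the
    same tree.
    """
    forest = {}
    for path in paths:
        node = forest
        for cls in reversed(path):  # walk from the root down to the subclass
            node = node.setdefault(cls, {})
    return forest


# The four separate solutions discussed for the model fragment of Fig. 9.
solver_output = [
    ("Expression", "Node"),
    ("ArithmeticExpression", "Expression", "Node"),
    ("VariableRefNode", "Node"),
    ("AssignmentNode", "Node"),
]

forest = collate(solver_output)
```

With this input the forest contains a single tree rooted at Node, which corresponds to one problem instance; solutions rooted at an unrelated class would start a second tree.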
7 Discussion
7.1 Related Work

Our approach to detecting instances of design problems solved by design patterns in models is related to the approaches that were proposed to tackle the problem of detecting patterns in object-oriented systems. These approaches can be organized by the target application and by the technique used. While the approaches in [1] and [5] both aim at detecting design problems, the first uses metrics to evaluate source code quality and the second uses logic programming. In both cases, design problems are defined as violations of object-oriented heuristics and hence are at a lower level of abstraction than the design problems solved by design patterns. The approach in [2] uses logical functions to specify Java-specific elemental patterns that compose design patterns. Elemental patterns are more suitable for automation but they do not provide as much design information as design patterns [9]. The approach in [8] uses constraint programming to detect design patterns in the source code. To the best of our knowledge, all these approaches focused on detecting instances or approximate instances of the solutions proposed by design patterns, not the problems they are meant to solve. While [8] focuses on detecting solutions provided by design patterns, it is an interesting approach because it uses explanation-based constraints to detect approximate instances of these solutions. At the implementation level, however, there are some differences in how CSPs are used: relationships (e.g. composition) in [8] are treated as constraints over the variables, which requires post-processing the detected solutions (e.g. there could be more than one composition relationship between two classes, and we need to retrieve all of them).

Recognizing a problem model in a given input model is a graph homomorphism problem. To tackle this problem, there are two main trends within model transformation approaches based on graph grammars.
Approaches like [18] and [10] use search plans for the traversal of the LHS (left-hand side) of a rule, while [15] uses constraint satisfaction techniques. The approach in [18] proposes to generate search plans using typical models of the system at hand. Depending on the graph pattern to be matched and the current model to be matched into, an optimal search strategy for the traversal of this graph pattern is derived from the search plans. In the same vein, [10] proposes to represent search plan operations as cost-weighted predicates to achieve an adaptive ordering of these operations, in particular the complex ones (e.g. checking negative application conditions). This adaptive ordering makes it possible to minimize the total cost of the matching process. However, there is an overhead in calculating the cost of all possible search plans, and the proposed representations of search plans are not intuitive. In one way or another, graph matching strategies rely on backtracking algorithms. The approach in [15] argues that most of the work on optimizing these algorithms has been done in the context of CSPs. Hence, to benefit from research findings in CSPs, [15] proposes to translate the graph pattern matching problem into a CSP. Contrary to [18], this approach decouples the solution algorithm from the concrete model at hand. However, translating the graph pattern into a CSP representation involves a computation overhead that depends on the size of the graph pattern (i.e. the LHS of a transformation rule). In our context, this graph pattern is a problem model whose size is typically small, as illustrated by the Visitor problem model example, where the average longest pattern path is three (aSubClass → aClass → aSuperClass).
7.2 Outstanding Issues
In our current implementation, we detect complete instances of a given problem model. However, some input models can include incomplete instances of some of our problem models. Our strategy for finding these incomplete instances is to define, for a given design pattern, the minimum problem model and to specify the set of transformation rules accordingly. To do so, we need to specify which entities of the problem model are optional with respect to the design problem specification, i.e. entities that can be removed from the problem model while it still corresponds to the design problem solved by the given pattern.
The set of mandatory entities and their relationships form the minimum problem model. Hence, when no instance of a problem model is found, we remove from the corresponding constraint problem the variables representing its optional entities, and we relax the constraints in which these variables are involved. We remove one variable at a time and, if still no instance is found, we remove two variables at once, and so on. The retained instance is the one with the maximal size. Another important problem to consider is the existence of overlapping instances of design problems in an input model. We need to run tests on several larger models to check whether conflicts arise when different instances of the same pattern, or of different patterns, apply to the same fragment of an input model. The relationships between design patterns described in [19] can help in implementing an approach to resolve these potential conflicts.
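The relaxation strategy for incomplete instances described above can be sketched as follows. This is an illustrative sketch, not the JSolver implementation: it uses a plain backtracking matcher, represents the input model as a set of (relation, source, target) triples, and assumes that relaxing a constraint means discarding it when one of its variables is removed. The function names, the sample model, and the choice of aSuperClass as an optional entity are hypothetical.

```python
from itertools import combinations


def find_matches(variables, constraints, nodes, edges):
    """Backtracking search: bind each pattern variable to a model node so that
    every (relation, a, b) constraint maps onto an edge of the input model."""
    matches = []

    def consistent(binding):
        # Only check constraints whose two variables are both bound.
        return all((rel, binding[a], binding[b]) in edges
                   for rel, a, b in constraints
                   if a in binding and b in binding)

    def extend(binding, remaining):
        if not remaining:
            matches.append(dict(binding))
            return
        var, rest = remaining[0], remaining[1:]
        for node in nodes:
            binding[var] = node
            if consistent(binding):
                extend(binding, rest)
            del binding[var]

    extend({}, list(variables))
    return matches


def match_with_relaxation(mandatory, optional, constraints, nodes, edges):
    """Try the full problem model first; if nothing matches, drop optional
    variables one at a time, then two at a time, and so on."""
    for k in range(len(optional) + 1):
        found = []
        for dropped in combinations(optional, k):
            kept = [v for v in mandatory + optional if v not in dropped]
            # Relax: discard constraints that mention a dropped variable.
            active = [c for c in constraints
                      if c[1] not in dropped and c[2] not in dropped]
            found.extend(find_matches(kept, active, nodes, edges))
        if found:
            return k, found  # smallest k, hence instances of maximal size
    return None, []


# Hypothetical input model with a two-level hierarchy only.
nodes = ["Shape", "Circle", "Square"]
edges = {("inherits", "Circle", "Shape"), ("inherits", "Square", "Shape")}
constraints = [("inherits", "aSubClass", "aClass"),
               ("inherits", "aClass", "aSuperClass")]
k, found = match_with_relaxation(["aSubClass", "aClass"], ["aSuperClass"],
                                 constraints, nodes, edges)
```

In this example the full three-level pattern has no match, so the search succeeds only after one optional variable is dropped. Stopping at the first k that yields a match is one way to retain the instances of maximal size.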
8 Conclusion
In this paper, we proposed a semi-automatic tool for marking models using constraint satisfaction techniques. This tool aims at providing a framework for recognizing design problems solved by design patterns and rewriting them according to the appropriate solutions. The key element of our approach is the explicit representation of the design problems solved by design patterns. Representing a design problem by an explicit model makes it possible to recognize instances of this problem model and to specify declaratively the application of the related design pattern by means of rules describing the transformation of this problem model into the corresponding solution model. Furthermore, our approach can be used to specify and apply other kinds of patterns.
References
1. Alikacem, E., Sahraoui, H.A.: Détection d'anomalies utilisant un langage de description règle de qualité. In: Rousseau, R., Urtado, C., Vauttier, S. (eds.) LMO’06, pp. 185--200 (2006)
2. Arcelli, A., Masiero, S., Raibulet, C.: Elemental Design Patterns Recognition In Java. In: 13th IEEE Int. Work. Software Technology and Engineering Practice, pp. 196--205 (2005)
3. Bacchus, F., van Beek, P.: On the Conversion between Non-Binary and Binary Constraint Satisfaction Problems. In: 15th Conf. on Artificial Intelligence, pp. 311--318 (1998)
4. Chun, A.: Constraint Programming in Java with JSolver. In: 1st Int. Conf. and Exhibition on the Practical Application of Constraint Technologies and Logic Programming (1999)
5. Ciupke, O.: Automatic Detection of Design Problems in Object-Oriented Reengineering. In: TOOLS 30, pp. 18--32. IEEE Computer Society Press (1999)
6. El-Boussaidi, G., Mili, H.: A Model-driven Framework for Representing and Applying Design Patterns. In: 31st COMPSAC, vol. 1, pp. 97--100 (2007)
7. Gamma, E., Helm, R., Johnson, R., Vlissides, J.: Design Patterns: Elements of Reusable Object-Oriented Software. Addison-Wesley (1995)
8. Guéhéneuc, Y-G., Jussien, N.: Using Explanations for Design Patterns Identification. In: IJCAI'01 Workshop on Modelling and Solving problems with constraints, pp. 57--64 (2001)
9. Gil, J., Maman, I.: Micro Patterns in Java Code. In: 20th conference on Object oriented programming systems languages and applications, OOPSLA’05, pp. 97--116 (2005)
10. Horvath, A., Varro, G., Varro, D.: Generic search plans for matching advanced graph patterns. In: 6th International GT-VMT Workshop (2007)
11. Model Driven Architecture Guide, http://www.omg.org/cgi-bin/doc?omg/03-06-01
12. Mili, H., El-Boussaidi, G.: Representing and Applying Design Patterns: What Is the Problem? In: ACM/IEEE 8th MODELS. LNCS, vol. 3713, pp. 186--200 (2005)
13. Mili, H., Mili, F., Mili, A.: Reusing software: Issues and research directions. In: IEEE Transactions on Software Engineering, vol. 21, no. 6, pp. 528--562 (1995)
14. Partsch, H., Steinbruggen, R.: Program Transformation Systems. In: Computing Surveys, vol. 15, no. 3, pp. 199--236 (1983)
15. Rudolf, M.: Utilizing Constraint Satisfaction Techniques for Efficient Graph Pattern Matching. In: 6th International Workshop on Theory and Application of Graph Transformations, LNCS, vol. 1764, pp. 238--251. Springer, London (1998)
16. Sahraoui, H., Boukadoum, M., Lounis, H., Ethève, F.: Predicting Class Libraries Interface Evolution: an investigation into machine learning approaches. In: 7th APSEC (2000)
17. Tsang, E.P.K.: Foundations of Constraint Satisfaction. Academic Press (1993)
18. Varro, G., Friedl, K., Varro, D.: Adaptive Graph Pattern Matching for Model Transformations using Model-sensitive Search Plans. In: Karsai, G., Taentzer, G. (eds.) GraMot 2005, ENTCS, vol. 152, pp. 191--205 (2006)
19. Zimmer, W.: Relationships between design patterns. In: Pattern Languages of Program Design. Addison-Wesley (1994)