Automatically Checking the Correctness of Feature Model Refactorings Rohit Gheyi, Tiago Massoni, Paulo Borba Federal University of Pernambuco { rg,tlm,phmb }@cin.ufpe.br
Abstract Software Product Line evolution can be performed by refactorings, which involve not only program refactorings that improve code structure, but also feature model (FM) refactorings that improve configurability. However, the current catalog of FM refactorings is not complete, and it is difficult to propose new sound refactorings. Moreover, there are no guidelines on how to apply such a catalog in a refactoring process. We propose an efficient encoding of FMs in Alloy, a formal specification language. Based on this encoding, we show how the Alloy Analyzer tool can be used to automatically check, in a few seconds, whether FM refactorings are correct.
1
Introduction
Adoption strategies for Software Product Lines (SPL) [5] frequently involve bootstrapping existing products into a SPL (extractive approach), extending an existing SPL to encompass another product (reactive approach), or a combination of both [5]. Extractive and reactive approaches can be enacted by applying program refactorings. However, the definition of program refactoring [8] does not take into account intrinsic characteristics of SPL, such as feature models (FM) [6], which are explained in Section 2. For instance, using program refactorings in a SPL may have the undesirable effect of reducing its configurability (the instances of the SPL), which is not useful in practice. To avoid that, an extended definition of SPL refactoring, which is accomplished by program and FM refactorings, was proposed [1]. A FM refactoring is sound when it improves the quality of a FM while maintaining or increasing its configurability.
Refactoring designers are responsible for proposing new general refactorings (templates). Extract Class [8] is an example of a general refactoring. Developers can use them in a refactoring process (specific transformations). A catalog containing a number of general FM refactorings was proposed [1]. However, this catalog is not proved to be complete. Therefore, developers may sometimes have to refactor FMs based on manual reasoning, which is time-consuming and error-prone. To solve this problem, refactoring designers have to extend the catalog. However, it is difficult for refactoring designers to propose new sound FM refactorings, as discussed in Section 3. Checking soundness by manual reasoning, or in a theorem prover following another approach [9], is hard and time-consuming, and requires extra expertise. Another problem is that there are no guidelines for developers on how to use the catalog in a refactoring process, as discussed in Section 3. If developers do not know which FM refactorings should be applied in each situation, they may waste time trying to figure out the correct sequence of refactorings.
In this paper, we propose an efficient encoding of FMs in Alloy [14] (Section 4), which is a formal specification language. Based on this encoding, the Alloy Analyzer [15], a tool used to analyze Alloy models, helps refactoring designers to automatically check whether general FM refactorings are sound. Moreover, this approach helps developers in a refactoring process by automatically checking whether a specific transformation is a refactoring. If the transformation is not a refactoring, the tool generates a counterexample. The main contributions of this paper are the following:
• An efficient encoding for feature models in Alloy (Section 5);
• An automatic way to check whether a general FM transformation is a refactoring (Section 6.1);
• An automatic way to prove in a refactoring process whether a specific FM refactors another (Section 6.2).
This encoding allows refactoring designers to check general refactorings with thousands of features in a few minutes using the Alloy Analyzer (Section ??). In a refactoring process, developers can always prove whether one FM refactors another.
2
Feature Models
In this section, we give an overview of FMs. A FM represents the common and variable features of a SPL and the dependencies between them [6]. A feature diagram, which is a tree, is a graphical representation of a feature model. Relationships between a parent feature and its child features (or subfeatures) are categorized as Optional (features that may or may not be selected, represented by an unfilled circle), Mandatory (features that are required, represented by a filled circle), Or (one or more subfeatures must be selected, represented by a filled triangle), and Alternative (exactly one subfeature must be selected, represented by an unfilled triangle). Figure 1 depicts these relationships graphically.
Figure 1. Feature Diagram Notations
Besides these relationships, we allow FMs to include formulas about features. For instance, the formula earphone ⇔ mp3 states that the feature earphone is selected iff the feature mp3 is selected. Figure 2 depicts a simplified FM for a mobile phone. A mobile phone may have an earphone. Moreover, it must have at least one of an mp3 player or a digital camera. Finally, a mobile phone has an earphone iff it has an mp3 player. So, the FM has four features (mobilephone, earphone, mp3 and camera), one formula (earphone ⇔ mp3) and two relations: an optional relation between mobilephone and earphone, and an or feature relation between mobilephone, mp3 and camera.
Figure 2. Feature Model Example
The semantics of a FM is the set of its possible (valid) configurations. A configuration contains a set of feature names; if valid, it satisfies all constraints (relations and formulas) of the model. For example, the configuration { mobilephone, camera } is valid for the model in Figure 2. However, the configuration { mobilephone, earphone } is invalid because the or feature relation between mobilephone, mp3 and camera states that whenever mobilephone is selected, at least mp3 or camera must be selected.
Another work [1] defines a FM refactoring as a transformation that improves the quality of a FM by maintaining or increasing its configurability. So the resulting FM contains all valid configurations of the initial FM, but may contain more. Refactoring FMs is very important when refactoring SPL. Besides using the known program refactorings [8] with compilation and tests, we must check whether the configurability of the SPL is maintained or increased.
A number of general FM refactorings were proposed by refactoring designers that can be applied based on template matching [1]. Next we give an overview of the notation used to state them. Each general refactoring consists of two templates (patterns) of FMs, on the left-hand (LHS) and right-hand (RHS) sides. We can apply a refactoring whenever the left template is matched by the FM. A matching is an assignment of all meta-variables occurring in LHS/RHS models to concrete values. Any element not mentioned in both FMs remains unchanged, so the refactoring templates only show the differences between the FMs. Moreover, a dashed line on top of a feature indicates that this feature may have a parent feature. A dashed line below a feature indicates that this feature may have additional subfeatures. For example, Refactoring 2 depicts a general transformation that collapses an optional feature and an or feature relation into a general or feature relation encompassing all features. The transformation increases the configurability of the resulting FM by allowing the configuration { A, B }. It is important to mention that A, B, C and D are meta-variables.
Refactoring 2 ⟨collapse optional and or⟩
For instance, the FM depicted in Figure 2 matches the LHS FM template of Refactoring 2. The meta-variables A, B, C and D are matched with mobilephone, earphone, mp3 and camera, respectively. Developers can apply these general transformations in a refactoring process. For instance, we can apply Refactoring 2 to the specific FM depicted in Figure 2 in order to collapse an optional feature and an or feature relation into a more general or feature relation encompassing all features.
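The claim that Refactoring 2 only adds the configuration { A, B } can be cross-checked by brute force. The following is a minimal Python sketch, not the Alloy encoding used in this paper; it assumes the reading of the relations given in Section 5.1 (a child implies its parent, an or relation is an iff, and the root A is always selected), and the function names lhs and rhs are hypothetical.

```python
from itertools import combinations

FEATURES = ["A", "B", "C", "D"]  # the meta-variables of Refactoring 2


def all_configs(features):
    """Every subset of the feature names, as frozensets."""
    return [frozenset(c) for r in range(len(features) + 1)
            for c in combinations(features, r)]


def lhs(conf):
    # Root A, optional B under A, or relation between A and {C, D}.
    return ("A" in conf
            and ("B" not in conf or "A" in conf)
            and (("A" in conf) == ("C" in conf or "D" in conf)))


def rhs(conf):
    # Root A, or relation between A and {B, C, D}.
    return "A" in conf and (("A" in conf) == any(f in conf for f in "BCD"))


lhs_configs = {c for c in all_configs(FEATURES) if lhs(c)}
rhs_configs = {c for c in all_configs(FEATURES) if rhs(c)}

# A refactoring must keep every LHS configuration valid on the RHS.
is_refactoring = lhs_configs <= rhs_configs
gained = rhs_configs - lhs_configs
print(is_refactoring, [sorted(c) for c in gained])
```

Under these assumptions the inclusion holds and the only configuration gained is the one containing A and B, in line with the description of Refactoring 2.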
3
Problems in FM Refactorings
In this section, we show two problems that arise when checking soundness and applying FM refactorings. Problem 1 (Section 3.1) is related to refactoring designers (checking general refactorings), whereas Problem 2 (Section 3.2) is related to developers (applying specific transformations in a refactoring process).
3.1
Checking General FM Refactorings
A catalog of FM refactorings was proposed [1]. Since the catalog is not proved to be complete, developers may sometimes have to reason manually in order to check whether a transformation they wish to perform is indeed a refactoring, that is, whether it improves configurability. However, it is difficult, time-consuming and error-prone for developers to reason directly about FM semantics to check configurability. In order to solve this problem, refactoring designers have to propose more refactorings. However, there are no guidelines to help them propose new sound refactorings. They have to prove FM refactorings manually or using a theorem prover following a previous approach [9]. As is widely known, theorem proving is a demanding task that requires sophisticated expertise and domain experts. For instance, the proofs of the proposed FM refactorings take more than a hundred pages [9].
Another problem is that failing to prove a refactoring says nothing about whether it is sound. In order to show that a transformation is not a refactoring, refactoring designers have to find a configuration (counterexample) that belongs to the initial FM but does not belong to the final FM. It is not easy to find a counterexample for FMs with hundreds of features. Therefore, refactoring designers may waste time trying to prove something that is not a refactoring. So, the current approach for proving FM refactorings is not practical. Refactoring designers need a more effective way to extend the catalog of sound FM refactorings.
3.2
Checking Refactoring Process
Figure 3 depicts two FMs of a mobile phone. Using the catalog, developers can show that the RHS FM in Figure 3 refactors the LHS FM. However, if we have a catalog containing a large number of refactorings, developers may waste time trying to choose the appropriate sequence of refactorings that should be applied in order to show that the RHS FM refactors the LHS FM.
Figure 3. Feature Model Refactoring?
In a refactoring process involving several steps and a big catalog, it is not easy for developers to know the exact sequence of refactorings that should be applied in order to relate both models. There are no guidelines stating which FM refactorings should be applied in each situation. Figure 3 depicts FMs containing a small number of features, and the refactoring chain is small. The problem is even worse for FMs containing a large number of features and refactoring chains with many steps. So, developers may waste time trying to choose the correct sequence of FM refactorings to be applied in a refactoring process. In the worst case, developers may waste time trying to prove something that is not a refactoring. If they cannot relate the models using the catalog, they cannot state anything about them.
This problem also happens in program refactorings [8]. In order to alleviate it, some bad smells [8] were proposed. For each bad smell, there are some program refactorings that might be applied. However, even with this help, developers may waste time trying to choose the correct sequence of program refactorings that should be applied in order to relate both programs.
In this paper we propose an approach that addresses both problems. We help refactoring designers and developers by automatically checking general refactorings and refactoring processes, respectively. Moreover, we also help them by showing counterexamples when the transformation is not a refactoring, that is, when it reduces configurability.
4
Alloy
In this section, we overview Alloy 4. We specify an efficient encoding of FMs in Alloy in order to solve the problems mentioned in Section 3. We chose Alloy because of its tool, which can perform automatic analysis, and because it is easy to use for formulating our particular type of problem. An Alloy model or specification is a sequence of paragraphs of two kinds: signatures, which are used for defining new types, and constraints, such as facts. Each signature denotes a set of objects, which are associated to other objects by relations declared in the signatures. A signature paragraph introduces a type and a collection of relations, called fields, along with their types and other constraints on their included values.
Next, we model in Alloy part of a banking system, in which each bank is related to a set of accounts. The following fragment declares some signatures (sig) and relations. In Bank's declaration, the set qualifier specifies that accs associates each element in Bank to a set of elements in Account. Each customer has an identifier. Moreover, accounts may be checking or savings. In Alloy, one signature can extend another, establishing that the extending signature (subsignature) is a subset of the parent signature.
sig Bank { accs: set Account }
sig Customer { id: Id }
sig Account { owner: set Customer }
sig ChAcc, SavAcc extends Account {}
sig Id {}
A fact (fact) packages formulas that always hold, such as invariants about the elements. The following example introduces a fact named BankInvs, establishing general properties about the previously introduced signatures. Each account is related to exactly one customer. The all keyword represents the universal quantifier. The one keyword, when applied to an expression, denotes that the expression has exactly one element. The dot operator . is a generalized definition of the standard relational join operator. For instance, the join acc.owner, where acc is an account and owner is a binary relation that relates accounts to customers, yields the owners of an account.
fact BankInvs { all acc:Account | one acc.owner }
Predicates (pred) are used to package reusable formulas. The subsequent fragment declares the member predicate, which checks whether an account belongs to a bank. The in keyword denotes the set membership operator.
pred member[b:Bank, a:Account] { a in b.accs }
4.1
Analysis
Alloy has paragraphs that are used for guiding analysis with the Alloy Analyzer tool [15]. This tool can be used to verify whether some property holds for a pre-defined scope. A scope defines the maximum number of objects allowed for each signature during analysis. The tool automatically searches all possible situations up to a given scope. Assertions (assert) are another kind of formula paragraph; they declare questions about a model. Suppose in the previous banking system example that we would like to know whether distinct (disj) customers have distinct identifiers. The following assertion declares this constraint.
assert differentId { all c1,c2: Customer | disj[c1,c2] => c1.id != c2.id }
check differentId for 3
The ! operator denotes negation. The Alloy Analyzer tool can check whether the previous assertion is valid up to a given scope. In the previous fragment, we declare a check command with a scope of 3. We want to check this property in all possible situations containing at most three banks, three accounts, and so on. Scopes for analysis can be specified for each signature separately. Performing analysis using the previous check command, the tool generates a counterexample in which two distinct customers have exactly the same identifier. This means that the property does not follow from the model's constraints.
The analyses performed by the Alloy Analyzer tool are sound and complete up to a given scope. If there is some instance that contradicts an assertion up to a given scope, the tool shows the counterexample. However, if the tool does not find any counterexample, we only know that the property holds within that scope. We cannot conclude that the formulas declared in the assertion are valid for a greater scope, since the tool is not a theorem prover. By increasing the scope, however, we can gain greater confidence. In some specific domains, we know exactly the number of instances of each signature involved. In this situation, we can perform a complete analysis.
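The idea of analysis up to a scope can be imitated with a small exhaustive search. The sketch below is written in Python rather than Alloy and only covers the Customer and Id signatures of the banking example; the function find_counterexample and the names c0, id0 and so on are hypothetical, and this is in no way how the Alloy Analyzer works internally.

```python
from itertools import product


def find_counterexample(scope):
    """Search every instance with up to `scope` customers and ids
    for two distinct customers that share an identifier."""
    for n_customers in range(1, scope + 1):
        for n_ids in range(1, scope + 1):
            # Each customer maps to exactly one id (the `id: Id` field).
            for ids in product(range(n_ids), repeat=n_customers):
                for c1 in range(n_customers):
                    for c2 in range(c1 + 1, n_customers):
                        if ids[c1] == ids[c2]:
                            return {"c%d" % c: "id%d" % i
                                    for c, i in enumerate(ids)}
    return None  # no counterexample within this scope


counterexample = find_counterexample(3)
print(counterexample)
print(find_counterexample(1))
```

With a scope of 1 the search finds nothing, since a single customer cannot clash with anyone, while a scope of 3 exposes the violation. This illustrates why the absence of a counterexample in a small scope says nothing about larger scopes.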
5
Feature Models in Alloy
In this section, we specify an efficient encoding for FMs in Alloy. Our aim is to use this encoding in order to solve the problems presented in Section 3.
5.1
Efficient Encoding
A FM has a set of feature names. In our encoding, we only have two signatures (FM and Name) representing all elements of a FM.
sig FM { features: set Name }
sig Name {}
A FM may have some relations (Figure 1). We specify one predicate for each FM relation. Suppose that A is related to a subfeature (child) B. A configuration (conf) is represented by a set of feature names. Next we specify the optional and mandatory relations between A and B in predicates. Moreover, we state a predicate for the root (root) of a FM. For a given configuration (conf), the root feature must be included.
pred optional[A,B:Name,conf:set Name] { B in conf => A in conf }
pred mandatory[A,B:Name,conf:set Name] { A in conf <=> B in conf }
pred root[A:Name,conf:set Name] { A in conf }
The in, => and <=> operators denote subset, implication and iff, respectively. Suppose that A is related to two subfeatures B and C. Next we specify the or and alternative relations in predicates.
pred orFeature[A,B,C:Name,conf:set Name] { A in conf <=> (B in conf) or (C in conf) }
pred alternative[A,B,C:Name,conf:set Name] {
orFeature[A,B,C,conf]
B in conf => C !in conf
C in conf => B !in conf
}
The ! operator denotes negation. The or and alternative relations between more than two subfeatures can be specified similarly. Our encoding approach abstracts formulas' syntax. We directly encode them based on their semantics. They are specified similarly to the previous predicates.
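To make the meaning of the five relation predicates concrete, they can be mirrored as plain Boolean functions over a set of selected names. This is an illustrative Python sketch, not part of the Alloy encoding; it assumes the iff reading that the text gives for the mandatory, or and alternative relations, and the snake_case function names are hypothetical.

```python
def optional(a, b, conf):
    # B in conf => A in conf: a child can only be selected with its parent.
    return (b not in conf) or (a in conf)


def mandatory(a, b, conf):
    # A in conf <=> B in conf
    return (a in conf) == (b in conf)


def root(a, conf):
    # The root feature is part of every configuration.
    return a in conf


def or_feature(a, b, c, conf):
    # A in conf <=> (B in conf) or (C in conf)
    return (a in conf) == (b in conf or c in conf)


def alternative(a, b, c, conf):
    # An or relation in which B and C exclude each other.
    return or_feature(a, b, c, conf) and not (b in conf and c in conf)


print(alternative("A", "B", "C", {"A", "B"}))       # one subfeature selected
print(alternative("A", "B", "C", {"A", "B", "C"}))  # both subfeatures selected
print(or_feature("A", "B", "C", {"A", "B", "C"}))   # allowed by the or relation
```

Selecting both subfeatures violates the alternative relation but satisfies the or relation, which is exactly the difference exploited later when comparing the two.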
5.2
Example
Suppose that we would like to specify the FM in Figure 2 using the encoding presented in Section 5.1. Firstly, we declare its elements. We declare a singleton (one) signature, which has exactly one object, for each element. The FM in Figure 2 is represented by M, which extends FM, and has four features. A singleton signature is declared for each feature name. Finally we state M's features in a fact, as declared next.
one sig M extends FM {}
one sig mobilephone, earphone, mp3, camera extends Name {}
fact MFeatures { M.features = mobilephone+earphone+mp3+camera }
The + operator denotes set union. Next we explain how to specify the semantics predicate for the FM in Figure 2. This predicate contains all constraints of the FM. Firstly, a configuration must include a subset of the feature names of the FM. Moreover, the root of the FM must always be included, as declared next. Both constraints, which are called implicit constraints, must appear in the semantics of all FMs.
pred semanticsM[conf: set Name] {
conf in M.features
root[mobilephone,conf]
Then we state all relations of the FM using the predicates presented in Section 5.1. The model of Figure 2 has two relations: an optional relation between the mobilephone and earphone features, and an or feature relation between mobilephone, mp3 and camera. Next we specify both relations.
optional[mobilephone,earphone,conf]
orFeature[mobilephone,mp3,camera,conf]
Finally, after specifying all relations and implicit constraints, we specify all explicit formulas. The FM of Figure 2 has one formula: earphone ⇔ mp3. Each operator of a formula is directly translated to the equivalent one in Alloy. Every occurrence of a feature name is appended with in conf. In Alloy, we can express first-order logic formulas. Therefore, any first-order formula can be directly translated to Alloy. Next we show the translation of the earphone ⇔ mp3 formula to an equivalent one in our encoding.
earphone in conf <=> mp3 in conf
}
So, for a FM, the semantics predicate must specify the two implicit constraints and all relations and formulas declared. The relations are specified using the predicates in our encoding, and formulas are translated to equivalent ones in Alloy by appending in conf to each name in the formula. This translation from FM to Alloy is systematic and can be easily implemented by a tool. So far we have shown an encoding for FMs and how we can use it to specify some specific FMs in Alloy. Next we explain how we can benefit from the analyses performed by the Alloy Analyzer tool.
5.3
Analysis
Based on the previous encoding, we can perform automatic analysis on FMs using the Alloy Analyzer. As explained in Section 4.1, the Alloy Analyzer performs a complete analysis up to a given scope. Figure 2 has exactly one FM and four feature names. Since we know exactly the number of objects of all signatures (FM and Name) in our encoding, we can perform a complete analysis of the FM of Figure 2 using the Alloy Analyzer. Next we show two useful analyses that can be performed.
Checking a Configuration: Sometimes it is useful to check whether a specific configuration is valid for a FM. In order to do that, we can use Alloy's assertions. For example, the following assertion states that selecting mobilephone and mp3 is a valid configuration for M.
assert validConfig { semanticsM[mobilephone+mp3] }
check validConfig for 1 FM, 4 Name
Performing analysis on the previous assertion yields a counterexample indicating that this assertion is invalid. So selecting mobilephone and mp3 is not a valid configuration for M. This configuration does not satisfy the earphone ⇔ mp3 formula of the model. Checking the same assertion for a configuration containing mobilephone and camera does not yield a counterexample. So, this is a valid configuration for M.
Finding a Valid Configuration: Besides checking a property, the Alloy Analyzer can be used to find a solution (instance) for a predicate using the run command. For example, we can perform analysis to show valid configurations of M by running the semanticsM predicate using the following command.
run semanticsM for 1 FM, 4 Name
Running the semanticsM predicate yields a solution, which is depicted in Figure 4. This solution represents a valid configuration of the semantics of the FM in Figure 2. It has the FM M with four feature names (mobilephone, earphone, mp3 and camera) and a configuration including the features mobilephone and camera. Each feature name selected in Figure 4 is labeled with config.
Figure 4. Valid Configuration
The Alloy Analyzer allows us to find other solutions. In our case, this implies that performing analysis using the Alloy Analyzer on semanticsM allows us to automatically know all valid configurations of a FM. Both analyses take less than a second to be performed. Since most of the time we will perform analysis on specific FMs, such as the FM in Figure 2, we will know exactly the number of elements of the FMs involved. Therefore, we can perform a complete analysis using the Alloy Analyzer in those situations.
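Both analyses can be reproduced by exhaustive enumeration, because the FM of Figure 2 has only four features. The following Python sketch is a cross-check under the same reading of the constraints as the semanticsM predicate, not a replacement for the Alloy Analyzer; the function name semantics_m is a hypothetical counterpart of that predicate.

```python
from itertools import combinations

FEATURES = ("mobilephone", "earphone", "mp3", "camera")


def semantics_m(conf):
    """Counterpart of semanticsM for the FM of Figure 2."""
    return (conf <= set(FEATURES)                                  # subset of features
            and "mobilephone" in conf                              # root
            and ("earphone" not in conf or "mobilephone" in conf)  # optional
            and (("mobilephone" in conf)
                 == ("mp3" in conf or "camera" in conf))           # or feature
            and (("earphone" in conf) == ("mp3" in conf)))         # earphone <=> mp3


# Checking a configuration (the role of the check command).
print(semantics_m({"mobilephone", "mp3"}))
print(semantics_m({"mobilephone", "camera"}))

# Finding all valid configurations (the role of the run command).
valid = [set(c) for r in range(len(FEATURES) + 1)
         for c in combinations(FEATURES, r) if semantics_m(set(c))]
for conf in valid:
    print(sorted(conf))
```

Under these assumptions the enumeration rejects the selection of mobilephone and mp3, accepts mobilephone and camera, and finds three valid configurations in total.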
6
FM Refactorings
In this section, we explain how the FM encoding in Alloy presented in Section 5 and the Alloy Analyzer can help to automatically check whether general FM refactorings are sound (Section 6.1), and to refactor FMs in a refactoring process (Section 6.2).
6.1
Checking General FM Refactorings
As mentioned before, the catalog of FM refactorings is not complete (Problem 1). Refactoring designers need to propose more FM refactorings. However, there are no guidelines to help them check whether they are sound. Proving FM refactorings manually or in theorem provers is difficult and time-consuming, and requires domain expertise. So, it is difficult to extend the catalog of FM refactorings. We propose an efficient encoding in Alloy to check FM refactorings up to a given scope.
6.1.1
Specification
Suppose that a refactoring designer proposes Refactoring 1, which converts an alternative relation between variables A, B and C to an or feature relation.
Refactoring 1 ⟨convert alternative to or⟩
In order to check whether this refactoring is sound in our encoding, refactoring designers must specify the syntactic relationship, the semantics of each FM and an assertion. Only some parts of the specification, which are called hot spots, need to be specified. The other parts are the same for all FM refactorings, as we explain next. This approach is systematic and can be easily implemented by a tool.
Syntax: Refactoring 1 has two FMs. The LHS and RHS FMs are represented by M1 and M2. Moreover, it has three features: A, B and C. All FMs and names are represented by signatures containing exactly one element.
one sig M1,M2 extends FM {}
one sig A,B,C extends Name {}
The A, B and C features belong to both FMs. As explained before, all elements not mentioned in a refactoring must remain the same. Therefore both models have the same features.
fact SyntacticRelationship {
A+B+C in M1.features
M1.features = M2.features
}
Semantics: After specifying the syntactic relationship, refactoring designers must specify the semantics of both FMs in predicates. Each FM has an implicit constraint: a configuration must include a subset of the FM's features. Moreover, the LHS FM has an alternative relation between features A, B and C, whereas the RHS FM has an or feature relation between A, B and C. Refactoring designers should specify those relations using the predicates in our encoding.
pred semanticsM1[conf: set Name] {
conf in M1.features
alternative[A,B,C,conf]
}
pred semanticsM2[conf: set Name] {
conf in M2.features
orFeature[A,B,C,conf]
}
Analysis: In order to check whether the transformation is a refactoring, the following assertion states the definition of a FM refactoring (Section 2). The assertion specifies that all valid configurations of the LHS FM are valid configurations of the RHS FM.
assert refactoring { all conf:set Name | semanticsM1[conf] => semanticsM2[conf] }
check refactoring for 2 FM, 10000 Name
Our encoding only declares two signatures: FM and Name. Since we have two FMs, the only scope that refactoring designers should provide when checking refactorings is the number of features. The Alloy Analyzer does not find any counterexample using a scope of 10000 features. So, Refactoring 1 can be applied to any FM that matches the LHS template and has at most 10000 features. In practice, we do not have FMs containing more than 10000 features. Therefore, we can say that in practice Refactoring 1 is sound.
6.1.2
Formulas
Both FMs in Refactoring 1 have the same relations and formulas, except for the alternative and or feature relations. As mentioned before, relations are represented by formulas using the predicates explained in Section 5.1. So, a set of formulas (relations and explicit formulas), which we call forms, is present in both FMs but not depicted in the transformation. All elements not mentioned in a refactoring remain the same in both FMs. So, both FMs must have forms. For a precise semantics definition, refactoring designers would have to specify forms in the semantics predicate of both FMs in Refactoring 1, as declared next.
pred semanticsM1[conf: set Name] {
conf in M1.features
alternative[A,B,C,conf]
forms
}
pred semanticsM2[conf: set Name] {
conf in M2.features
orFeature[A,B,C,conf]
forms
}
However, we do not specify forms in Section 6.1.1. In fact, we do not need to specify forms, since both FMs have them. The definition of FM refactoring states that all valid configurations of the LHS FM must be valid configurations of the RHS FM. If a configuration conf satisfies the semantics of the LHS FM, it must satisfy forms. Since it satisfies forms in the LHS FM, it also satisfies forms in the RHS FM. As a consequence, refactoring designers do not need to specify forms, and can focus only on the differences between both FMs. A similar argument explains why refactoring designers do not specify the root implicit constraint. Both FMs have the same root, hence the same implicit constraint. Therefore, they do not need to specify it. So, refactoring designers do not need to specify any constraint that appears in both FMs of a refactoring template.
6.1.3
Counterexample
Sometimes refactoring designers may propose transformations that are intended to be refactorings, but they are not. In our approach, we can help them by giving some counterexamples showing why a transformation is not a refactoring. Suppose that the refactoring designer thought that applying Refactoring 1 from right to left also defines a refactoring. We specify the following assertion in order to check it. assert wrongRefactoring { all conf:set Name | semanticsM2[conf] => semanticsM1[conf] } check wrongRefactoring for 2 FM, 3 Name The Alloy Analyzer yields a counterexample (configuration), in which the features A, B and C are selected, by checking the previous assertion. This configuration belongs to the RHS FM of Refactoring 1, but does not belong to the LHS FM (B and C cannot be selected at the same time). Therefore, applying Refactoring 1 from right to left is not a refactoring since the LHS FM does not contain all configurations of the RHS FM. This counterexample is generated in less than a second.
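For the three meta-variables of Refactoring 1, both directions can also be checked by enumerating all eight configurations. The sketch below is in Python rather than Alloy and uses the same reading of the alternative and or relations as the semanticsM1 and semanticsM2 predicates, with no root constraint; the last check treats every possible set of allowed configurations as a hypothetical shared constraint forms, to illustrate why constraints common to both FMs can be left out.

```python
from itertools import combinations

NAMES = ("A", "B", "C")
CONFIGS = [frozenset(c) for r in range(len(NAMES) + 1)
           for c in combinations(NAMES, r)]


def or_feature(conf):
    return ("A" in conf) == ("B" in conf or "C" in conf)


def alternative(conf):
    return or_feature(conf) and not ("B" in conf and "C" in conf)


m1 = {c for c in CONFIGS if alternative(c)}  # LHS of Refactoring 1
m2 = {c for c in CONFIGS if or_feature(c)}   # RHS of Refactoring 1

forward_ok = m1 <= m2              # alternative to or
reverse_counterexamples = m2 - m1  # or to alternative
print(forward_ok, [sorted(c) for c in reverse_counterexamples])

# Any shared constraint `forms` restricts both sides equally, so the
# inclusion survives; here every possible `forms` over A, B, C is tried.
shared_ok = all(
    {c for c in m1 if c in forms} <= {c for c in m2 if c in forms}
    for r in range(len(CONFIGS) + 1)
    for forms in map(set, combinations(CONFIGS, r)))
print(shared_ok)
```

The forward direction holds, and the only configuration that breaks the reverse direction selects A, B and C together, which agrees with the counterexample reported by the Alloy Analyzer.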
One problem of the previous approach [9] (proving refactorings in a theorem prover) is that, in order to show that a transformation is not a refactoring, refactoring designers have to find a counterexample by hand. This activity can also be time-consuming. In our approach, the Alloy Analyzer automatically generates a counterexample in a few seconds, which helps refactoring designers understand why the transformation is not a refactoring.
6.1.4
Generalization
In order to check a FM refactoring, refactoring designers have to specify four hot spots in our encoding: • extend Name with all features in the refactoring; • specify a syntactic relationship (the features relation) between both FMs; • specify all relations and formulas in the predicates defining FM’s semantics; • specify the number of features for analysis. Figure 5 depicts another example of how to specify a FM refactoring. Only the dashed rectangles (hot spots) need to be specified by refactoring designers. This approach is systematic. We can build a tool to automatically generate a specification in Alloy from a FM refactoring. The refactoring designers just need to specify the transformation (they will not need to know anything about Alloy), and the tool will say whether it is a FM refactoring.
6.2
Checking Refactoring Process
Figure 5. Specifying a FM Refactoring

Sometimes developers have the initial (original) and final (desired) FMs and would like to relate them using the catalog of refactorings, especially during a refactoring process (a chain of refactorings). Since there are no guidelines stating which FM refactorings should be applied in each situation, it may be difficult and time-consuming to find the exact sequence of refactorings that relates them (Problem 2). Our approach solves this problem by directly translating the initial and final FMs into our encoding and using an assertion to check whether they are related by a refactoring chain. For instance, developers can specify both FMs depicted in Figure 3. Next we specify the LHS and RHS FMs, which are represented by M1 and M2, respectively.

    one sig M1, M2 extends FM {}
    one sig mobilephone, earphone, mp3, camera extends Name {}

    fact SyntacticRelationship {
      M1.features = M2.features
      M1.features = mobilephone + earphone + mp3 + camera
    }

    pred semanticsM1[conf: set Name] {
      conf in M1.features
      root[mobilephone, conf]
      orFeature[mobilephone, mp3, camera, conf]
      optional[mp3, earphone, conf]
    }

    pred semanticsM2[conf: set Name] {
      conf in M2.features
      root[mobilephone, conf]
      orFeature[mobilephone, mp3, camera, conf]
      optional[mobilephone, earphone, conf]
    }

    assert refactoring {
      all conf: set Name | semanticsM1[conf] => semanticsM2[conf]
    }

    check refactoring for 2 FM, 4 Name

Checking the previous assertion does not yield a counterexample. Since we know exactly the number of features (4) involved in the transformation, the analysis performed by the Alloy Analyzer is complete. Therefore, we proved that the RHS FM refactors the LHS FM.

As another example, suppose that developers would like to check whether the LHS FM refactors the RHS FM. In this case, the Alloy Analyzer yields a counterexample: a configuration including mobilephone, earphone and camera. This is a valid configuration of the RHS FM, but it is invalid for the LHS FM. Therefore, it is not a refactoring. Although the FMs depicted in Figure 3 are small, this approach is very useful for FMs containing a large number of features and for long refactoring chains. When a transformation is not a refactoring, the generated counterexample helps developers understand why.

The analyses performed by the Alloy Analyzer are fast. In general, it takes a few minutes for FMs containing thousands of features. We used a laptop with a 2 GHz processor and 1 GB of RAM to analyze 19 FM refactorings [1] with up to 10000 features. The maximum time required to check a refactoring was 10 minutes. Refactoring designers need to check each refactoring only once. After that, developers can use the FM refactorings whenever they are dealing with FMs containing at most 10000 features. Since in practice most FMs have fewer than 10000 features, the refactorings can be applied most of the time. Eleven of the proposed refactorings [1] were proved using a theorem prover; the proofs take more than 700 pages [9]. In the approach presented here, using a scope of 10000 features, we can automatically check whether all eleven refactorings are sound in at most three hours using the current version of the Alloy Analyzer.
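The reverse check described in this section (whether the LHS FM refactors the RHS FM) can be sketched as a self-contained model. The bodies of root, optional and orFeature are assumptions made for this sketch, chosen to be consistent with the results reported above. The assertion name reverseRefactoring is also hypothetical.

```alloy
-- Assumed core predicates: these bodies are assumptions, chosen so that
-- the model reproduces the outcomes reported in the text.
sig Name {}
sig FM { features: set Name }

pred root[f: Name, conf: set Name] { f in conf }

-- An optional child may only be selected together with its parent.
pred optional[parent, child: Name, conf: set Name] {
  child in conf => parent in conf
}

pred orFeature[parent, c1, c2: Name, conf: set Name] {
  parent in conf <=> (c1 in conf or c2 in conf)
}

one sig M1, M2 extends FM {}
one sig mobilephone, earphone, mp3, camera extends Name {}

fact SyntacticRelationship {
  M1.features = M2.features
  M1.features = mobilephone + earphone + mp3 + camera
}

pred semanticsM1[conf: set Name] {
  conf in M1.features
  root[mobilephone, conf]
  orFeature[mobilephone, mp3, camera, conf]
  optional[mp3, earphone, conf]
}

pred semanticsM2[conf: set Name] {
  conf in M2.features
  root[mobilephone, conf]
  orFeature[mobilephone, mp3, camera, conf]
  optional[mobilephone, earphone, conf]
}

-- Reverse direction: every configuration of M2 must be valid for M1.
assert reverseRefactoring {
  all conf: set Name | semanticsM2[conf] => semanticsM1[conf]
}

check reverseRefactoring for 2 FM, 4 Name
```

With these definitions, the configuration mobilephone + earphone + camera satisfies semanticsM2 but violates optional[mp3, earphone, conf] in semanticsM1, because earphone is selected without mp3. This matches the counterexample reported above.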
7
Related Work
Batory [3] integrates prior results to connect feature diagrams, grammars, and propositional formulas. This connection also allows the use of SAT solvers to help debug feature models by confirming compatible and incomplete feature sets. He explains in more detail how to check three properties: (1) whether a FM has a contradiction, (2) whether, given some features selected by the user, the tool can assign values to the remaining features so that the FM semantics is satisfied, and (3) whether a configuration is valid for a FM. All three properties can be checked in the Alloy Analyzer. We explained how to check the first and third properties in Section 5. If performing analysis on the semantics predicate does not yield any configuration (solution), we have a contradiction (inconsistency). The second property can be checked by the following predicate.

    pred show[conf: set Name] {
      mobilephone + camera in conf
      semanticsM[conf]
    }

The Alloy Analyzer yields valid configurations of the FM depicted in Figure 2 that select mobilephone and camera. As mentioned before, the Alloy Analyzer allows us to find all solutions. Besides these properties, we show how the Alloy Analyzer can be useful for checking meta-properties (general refactorings).

Another work [18] proposes a textual language for describing features. Their language is similar to ours, but does not consider formulas. They propose a notion of FM semantics that is equivalent to ours. They also propose a set of fifteen rules relating equivalent FMs, which are very similar to bidirectional refactorings. They informally argue soundness, in contrast with our approach, which uses a model checker to increase the confidence.

A related approach [4] proposes an automatic way to analyze five properties of FMs, such as yielding the number of instances and all instances of a FM, and checking whether a FM is valid. They present a mapping that transforms an extended feature model into a Constraint Satisfaction Problem in order to formalize extended feature models using constraint programming. In our work, we can check these properties. Their idea of filters is equivalent to formulas in our FMs. In contrast to their work, our theory has limited support for integer expressions due to Alloy.

A related approach [16] proposes Feature Oriented Refactoring (FOR), which is the process of decomposing a program, usually a legacy one, into features. Such work focuses on configuration knowledge, specifying the relationships between features and their implementing modules, backed by a solid theory. The authors also present a semi-automatic refactoring methodology to enable the decomposition of a program into features. However, FOR focuses on bootstrapping a SPL from an existing application, rather than a model, as we explore in our work.

Czarnecki et al. [7] introduce cardinality-based feature modeling as an integration and extension of existing approaches. They specify a formal semantics for FMs with these features and translate cardinality-based FMs into context-free grammars. Another work [2] presents FeaturePlugin, a feature modeling plug-in for Eclipse that implements cardinality-based feature modeling [7]. The tool supports cardinality-based feature modeling, specialization of feature diagrams, and configuration based on feature diagrams. In our work, we can check whether a configuration belongs to a FM. However, we do not handle cardinality-based FMs. We also note that their formal treatment of FM specialization could be seen as the opposite of our notion of FM refactoring, so specializations can be analyzed similarly.

A related approach [17] presents a case study in feature refactoring, in which the AHEAD Tool Suite is refactored. Feature refactoring is defined as the process of decomposing a program into a set of features. Another work [13] proposes an algebra that is used to describe and analyze the commonalities and variabilities of a system family.

None of the previous works aims at proposing an automatic way to check FM refactorings. To our knowledge, no other work in refactoring programs or other kinds of models helps refactoring designers to automatically check whether refactorings are correct.

A previous work [11] proposes an encoding for FMs in Alloy. In both encodings we can check whether a transformation is a refactoring. However, that work [11] specifies a more detailed FM semantics than ours. As a consequence, it can check more meta-properties than our work, such as whether a refactoring preserves the FM well-formedness rules. In contrast, in this paper we propose an encoding specific to FM refactorings that is much more efficient than theirs. For instance, we can check whether refactorings are sound for up to 10000 features, whereas they can only check up to 20 features.
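Returning to the three properties from Batory [3] discussed at the start of this section, the following sketch shows how each could be posed as an Alloy command. The FM of Figure 2 is not reproduced here, so the FM M, the core predicate bodies, and the name validConfiguration are assumptions for illustration. The commands need not match those used in Section 5.

```alloy
-- Assumed core predicates; the bodies are assumptions for this sketch.
sig Name {}
sig FM { features: set Name }

pred root[f: Name, conf: set Name] { f in conf }

pred optional[parent, child: Name, conf: set Name] {
  child in conf => parent in conf
}

pred orFeature[parent, c1, c2: Name, conf: set Name] {
  parent in conf <=> (c1 in conf or c2 in conf)
}

-- Hypothetical FM standing in for the one depicted in Figure 2.
one sig M extends FM {}
one sig mobilephone, earphone, mp3, camera extends Name {}

fact {
  M.features = mobilephone + earphone + mp3 + camera
}

pred semanticsM[conf: set Name] {
  conf in M.features
  root[mobilephone, conf]
  orFeature[mobilephone, mp3, camera, conf]
  optional[mobilephone, earphone, conf]
}

-- Property 1: if this command finds no instance within a scope that
-- covers all features, the FM has a contradiction.
run semanticsM for 1 FM, 4 Name

-- Property 2: complete a partial selection (mobilephone and camera).
pred show[conf: set Name] {
  mobilephone + camera in conf
  semanticsM[conf]
}
run show for 1 FM, 4 Name

-- Property 3: a specific configuration is valid if this check
-- finds no counterexample.
assert validConfiguration {
  semanticsM[mobilephone + camera]
}
check validConfiguration for 1 FM, 4 Name
```

Each command is executed separately in the Alloy Analyzer. Asking the tool for the next instance of a run command enumerates further configurations, which corresponds to finding all solutions.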
8
Conclusions
In this paper, we propose an efficient encoding for FMs in Alloy. We show that this encoding and the Alloy Analyzer are useful for automatically checking whether general FM refactorings are sound, and for applying them in refactoring processes. Moreover, this approach not only confirms when a transformation is a refactoring, but also shows a counterexample when it is not. FM refactorings can be very useful when maintaining or evolving a SPL using program refactorings. Analysis of FMs containing thousands of features was performed within a few minutes. This approach may be useful in practice since most FMs have fewer than a thousand features. We specified 19 FM refactorings proposed elsewhere [1] in our encoding and checked them in the Alloy Analyzer. All specifications can be found online [12]. Analyses should become faster as the Alloy Analyzer and SAT solvers evolve. Comparing their analysis performance, Alloy 4 is at least 10 times faster than Alloy 3.

As future work, we aim at building a FM refactoring tool or extending FeaturePlugin [2] to automatically generate Alloy specifications from a FM refactoring. Developers and refactoring designers will then not need to know anything about Alloy to perform analysis. We also intend to investigate whether this approach is useful for checking refactorings of structural models, such as Alloy models and UML class diagrams [10].
References
[1] V. Alves et al. Refactoring product lines. In GPCE, pages 201–210, USA, 2006.
[2] M. Antkiewicz and K. Czarnecki. FeaturePlugin: feature modeling plug-in for Eclipse. In Eclipse, pages 67–72, 2004.
[3] D. Batory. Feature models, grammars, and propositional formulas. In 9th SPLC, volume 3714 of LNCS, pages 7–20. Springer, 2005.
[4] D. Benavides, A. Ruiz-Cortés, and P. Trinidad. Automated reasoning on feature models. 17th Conference on Advanced Information Systems Engineering, 3520:491–503, 2005.
[5] P. Clements and L. Northrop. Software Product Lines: Practices and Patterns. Addison-Wesley, 2001.
[6] K. Czarnecki and U. Eisenecker. Generative Programming: Methods, Tools, and Applications. Addison-Wesley, 2000.
[7] K. Czarnecki, S. Helsen, and U. W. Eisenecker. Formalizing cardinality-based feature models and their specialization. Software Process: Improvement and Practice, 10(1):7–29, 2005.
[8] M. Fowler. Refactoring: Improving the Design of Existing Code. Addison-Wesley, 1999.
[9] R. Gheyi et al. Theory and proofs for feature model refactorings in PVS. Technical report, UFPE, 2006. http://www.mit.edu/~gheyi/splc.htm.
[10] R. Gheyi, T. Massoni, and P. Borba. A rigorous approach for proving model refactorings. In ASE, pages 372–375, 2005.
[11] R. Gheyi, T. Massoni, and P. Borba. A theory for feature models in Alloy. In Alloy Workshop, pages 71–80, 2006.
[12] R. Gheyi, T. Massoni, and P. Borba. Specification of feature model refactorings in Alloy. http://www.mit.edu/~gheyi/splc.htm, 2007.
[13] P. Höfner, R. Khedri, and B. Möller. Feature algebra. In Formal Methods, volume 4085 of LNCS, pages 300–315. Springer-Verlag, 2006.
[14] D. Jackson. Software Abstractions: Logic, Language and Analysis. MIT Press, 2006.
[15] D. Jackson, I. Schechter, and I. Shlyakhter. Alcoa: the Alloy constraint analyzer. In ICSE, pages 730–733, 2000.
[16] J. Liu, D. Batory, and C. Lengauer. Feature oriented refactoring of legacy applications. In 28th ICSE, pages 112–121, 2006.
[17] S. Trujillo, D. Batory, and O. Diaz. Feature refactoring a multi-representation program into a product line. In GPCE, pages 191–200, 2006.
[18] A. van Deursen and P. Klint. Domain-specific language design requires feature descriptions. Journal of Computing and Information Technology, 10(1):1–17, 2002.