Extraction of Feature Models from Formal Contexts

Uwe Ryssel, Joern Ploennigs, Klaus Kabitzsch
Institute of Applied Computer Science
Dresden University of Technology
Dresden, Germany

uwe.ryssel, joern.ploennigs, [email protected]

ABSTRACT

For economic reasons, the creation of feature-oriented software should include previously created products instead of starting from scratch. To speed up this migration process, feature models have to be generated automatically from existing product variants. This work presents an approach based on formal concept analysis that analyzes incidence matrices containing matching relations as input and creates feature models as output. The resulting feature models describe exactly the given input variants. The novel, optimized approach introduced here performs this transformation in reasonable time even for large product libraries.

Categories and Subject Descriptors

D.2.7 [Software Engineering]: Distribution, Maintenance, and Enhancement—Restructuring, reverse engineering, and reengineering; D.2.9 [Software Engineering]: Management—Software Configuration Management

General Terms

Algorithms, Design

Keywords

Feature Models, Formal Concept Analysis

1. INTRODUCTION

Creating feature-oriented software from scratch is a time-consuming task. In many cases, product lines already exist and have been developed for years. Discarding them and restarting the development is usually not economical, and a manual conversion to a feature-oriented software product line is expensive as well. An automatic migration of the existing variants and an automatic creation of the corresponding feature model would significantly decrease the effort of creating a product line. Such a migration is usually done in four steps: The first step is to search the product line for common or similar artifacts.


In the second step, groups of similar product artifacts are merged to form configurable components. Identifying the variation points, features, and their dependencies is the next step. Using that information, the feature model and the configurable component can be created.

There are several approaches for the analysis of variation points. One example was given by the authors for the analysis of function-block-based models: In [1], a model matching and clustering approach is used to find and group structurally similar function-block-based models. Then, in [2], the matching relations are analyzed with formal concept analysis [3], resulting in preliminary, restricted feature models, which only support options and unconditioned alternatives. In this work, complete feature models are to be generated, which support options, alternatives, and or-relations. The input for the presented approach is an incidence matrix (see Tab. 1 for an example), which describes the common and differing artifacts of variants. Such a matrix can, for example, be created by the approach described in [1].

Although the presented approach should work with any kind of product, it is exemplified here for function-block-based models. Function-block-based models decompose the functionality of systems into function blocks. These are encapsulated algorithms with input and output data points used for communication and with parameters influencing the properties of the algorithm. The data flow among data points is specified by bindings that connect them. This kind of model is structurally similar to UML component diagrams with their components, required and provided interfaces, and connectors, but it has different semantics: component diagrams follow a service-oriented paradigm, in contrast to function-block-based models, which are data-flow oriented.

Tab. 1 shows the incidence matrix that is used as the example in this work. It shows the parts of a set of function-block-oriented models that describe different controllers of a DC motor. Each of the ten rows represents a certain controller variant: one open-loop controller and four closed-loop controllers, each existing in a test environment and a production environment. Each column represents a set of coexisting model artifacts, i.e., function blocks, connections, and parameter settings. The task is to find correlations among the model artifacts: which artifacts exclude each other, and which artifacts only occur in combination with other existing or missing artifacts. These dependencies are used to generate feature models that describe exactly the existing variants, such that only the given variants are valid feature configurations.

In the next section, the relevant basics of formal concept analysis are described. Sec. 3 introduces attribute concept graphs, which form the base to create feature diagrams (Sec. 4) and implications (Sec. 5), which result in feature models with constraints and feature mappings.

Table 1: Incidence matrix of model variants and its merged model artifacts

Variants (rows): 1 Open-Loop (test), 2 P (test), 3 PI (test), 4 PD (test), 5 PID (test), 6 Open-Loop (non-test), 7 P (non-test), 8 PI (non-test), 9 PD (non-test), 10 PID (non-test). A cross in the matrix marks that a variant contains the respective artifact.

Artifacts (columns): m: DC motor, t: Test, n: Non-test, p: P part, i: I part, d: D part, s: Part sum, h: Transfer function, ft: Feedback (test), fn: Feedback (non-test), cp: P-only connection, ct: To-transfer connection (test), cn: To-transfer connection (non-test), s2: Part sum has two inputs, s3: Part sum has three inputs

Performance considerations and case study results of this new approach are presented in Sec. 6. Finally, related approaches of other authors are discussed in Sec. 7.

2. FORMAL CONCEPT ANALYSIS

This section introduces the basics of formal concept analysis that are needed to understand the approach of this work. Further details can be found in [3].

Formal concept analysis is a mathematical method to find so-called concepts and their hierarchies in a formal context. A formal context consists of a set of objects G, a set of attributes M, and a has-relation I between them. Contexts are usually represented by an incidence matrix between the objects and attributes, in which a cross means that an object has an attribute. The matching result of the last section (see Tab. 1) forms a formal context between the variants (the objects) and the model artifacts (the attributes).

A formal concept of a formal context is, in an illustrative definition, a maximal filled rectangle of crosses in the context, without considering the order of columns and rows. Formally, a concept is a pair of a set of objects, the extent, and a set of attributes, the intent, and can be calculated by the operator ′, which is defined for sets of objects and sets of attributes. For a set A ⊆ G of objects, A′ is defined as the set of common attributes of all objects in A, and for a set B ⊆ M of attributes, B′ is defined as the set of common objects of all attributes in B. So ′ transforms between objects and attributes. The operator ′ can be applied repeatedly, where (A′)′ can be written as A″. This results in the concepts (A″, A′) and (B′, B″), respectively. The right-hand side of Fig. 1 shows three concepts created from the controller context; the first line is the extent and the second line the intent of each concept. For B = {i, d}, B′ is {5, 10} (the objects that contain both i and d) and B″ is {m, s, p, i, d, s3} (the attributes the objects 5 and 10 share), resulting in the concept Cs3 of Fig. 1.

Concepts that are derived from a single attribute are called attribute concepts. For any a ∈ M, ({a}′, {a}″) is the corresponding attribute concept. All three concepts of Fig. 1 are attribute concepts; they are based on the attributes i, d, and s3, which are typed in bold. Object concepts are defined analogously, but are not needed in this work.

The set of all concepts has a partial order, so that each concept can have super- and subconcepts. This order is defined by the subset relation among the object sets of the concepts. Since infimum and supremum are defined for the set of concepts, they form a concept lattice [3].

The lattice generated from the context of Tab. 1 has 34 concepts and is outlined on the left-hand side of Fig. 1. A detail of that lattice is shown on the right-hand side: Cs3 is a subconcept of Ci and Cd, and Ci and Cd are superconcepts of Cs3. As there are no other concepts in between, they are also lower and upper neighbors, respectively.
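
To make the derivation operator concrete, the following minimal Java sketch applies it to a three-row toy context; the attribute sets of the variants are assumed for illustration and are not copied from Tab. 1. It reproduces the computation of (B′, B″) for B = {i, d} described above.

import java.util.*;

public class Derivation {
    // toy context: object -> attributes it has (assumed data, in the spirit of Tab. 1)
    static Map<String, Set<String>> context = Map.of(
        "1", Set.of("m", "t", "h"),
        "5", Set.of("m", "t", "p", "i", "d", "s", "s3"),
        "10", Set.of("m", "n", "p", "i", "d", "s", "s3"));

    // B': objects that have all attributes in B
    static Set<String> primeOfAttributes(Set<String> b) {
        Set<String> objects = new TreeSet<>();
        for (Map.Entry<String, Set<String>> e : context.entrySet())
            if (e.getValue().containsAll(b)) objects.add(e.getKey());
        return objects;
    }

    // A': attributes shared by all objects in A
    static Set<String> primeOfObjects(Set<String> a) {
        Set<String> attrs = null;
        for (String g : a) {
            if (attrs == null) attrs = new TreeSet<>(context.get(g));
            else attrs.retainAll(context.get(g));
        }
        return attrs == null ? new TreeSet<>() : attrs;
    }

    public static void main(String[] args) {
        Set<String> b = Set.of("i", "d");
        Set<String> extent = primeOfAttributes(b);   // B'  = [10, 5]
        Set<String> intent = primeOfObjects(extent); // B'' = [d, i, m, p, s, s3]
        System.out.println("concept: (" + extent + ", " + intent + ")");
    }
}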

Figure 1: Three neighboring concepts and their position in the example lattice

There are different algorithms to calculate the complete set of concepts and the lattice structure. With NextConcept [4], the concepts can be calculated one by one in a lexical order. Neighbors [5] calculates the upper or lower neighbor concepts of a given concept, so the complete lattice structure can be determined.

Using this lattice, implications among the attributes can be derived. They are important to identify dependencies, which are needed to create feature models. An implication P ⇒ C, where P and C are attribute sets (P, C ⊆ M), holds iff (P′, P″) ≤ (C′, C″), i.e., iff the concept derived from P is a subconcept of or equal to the concept derived from C [3]. An implication between sets is defined by the conjunction of their elements. So the set implication {a, b} ⇒ {c, d} means a ∧ b ⇒ c ∧ d, i.e., if both attributes a and b exist in an object, c and d will also exist in the same object. The created implications do not contain negated literals, since they always consist of conjunctions of plain attributes. To find implications like ¬a ⇒ b, the context has to be extended by the negative counterparts of the attributes. So M = {p, i, d, …} extends to M* = {p, ¬p, i, ¬i, d, ¬d, …}.
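
A small Java sketch of this extension (class and helper names are made up for illustration, not taken from the paper's implementation):

import java.util.*;

public class NegatedContext {
    // Extends a context (object -> attribute set over the attribute universe m) by the
    // negated counterparts of all attributes an object does not have.
    static Map<String, Set<String>> withNegations(Map<String, Set<String>> ctx, List<String> m) {
        Map<String, Set<String>> out = new LinkedHashMap<>();
        for (Map.Entry<String, Set<String>> e : ctx.entrySet()) {
            Set<String> attrs = new LinkedHashSet<>(e.getValue());
            for (String a : m)
                if (!e.getValue().contains(a)) attrs.add("¬" + a); // object lacks a, so it has ¬a
            out.put(e.getKey(), attrs);
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, Set<String>> ctx = new LinkedHashMap<>();
        ctx.put("g1", Set.of("p"));
        ctx.put("g2", Set.of("i"));
        System.out.println(withNegations(ctx, List.of("p", "i", "d")));
        // {g1=[p, ¬i, ¬d], g2=[i, ¬p, ¬d]}
    }
}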

A context can also contain so-called reducible attributes. When such a reducible attribute is removed from the context, the lattice structure does not change; only the intents of some concepts shrink. For implications this means that any reducible attribute mr ∈ M can be represented by a combination of other attributes m1, …, mn ∈ M for which the equivalence m1 ∧ … ∧ mn ⇔ mr holds. For example, s3 in Fig. 1 is a reducible attribute, as it is equivalent to i ∧ d. This property can later be used to reduce the number of features.

A minimal implication base, the so-called stem base or Duquenne-Guigues base [6], which is complete and has no redundant implications, can be calculated by existing algorithms [4, 7]. However, these algorithms calculate both the implication base and the complete set of concepts. This is not a problem for the small example used in this work, but it becomes critical for larger examples with many (> 20) alternative model artifacts, which result in many attributes. Since the number of concepts |L(G, M, I)| increases exponentially with the relation size |I|, i.e., the number of crosses, the number of concepts quickly grows into the millions. Particularly the negated attributes create contexts with many crosses. A realistic example shows the dimensions: a context created for an example with 26 variants and 79 attributes has an implication base of 86 implications and over 67 million concepts. The calculation of this example takes more than three hours; other examples take days. So, methods that need a complete lattice or depend on the existing algorithms for calculating implication bases will not scale and are not appropriate for our task. In the next section, an alternative to the existing algorithms and methods is described, which depends neither on the complete lattice nor on existing implication base algorithms.

3. ATTRIBUTE CONCEPT GRAPHS

Since the complete lattices are too large, which prevents an efficient analysis of concepts and implications, another calculation base is needed to get the dependencies for feature models. In this section, such a base is defined by attribute concepts.

As described in the last section, each concept consists of a set of objects and a set of attributes. Because of the subset relation of the concept hierarchy, there is a kind of inheritance among the concepts. As shown in Fig. 1, objects are inherited from the lower neighbor concepts and attributes are inherited from the upper neighbor concepts. So for each concept, the extent and the intent can be divided into objects and attributes, respectively, that are inherited and that are not inherited. The non-inherited attributes can be calculated by removing the intents of the upper neighbor concepts from the intent of the given concept. This intent difference M∆ of a given concept (A, B) is defined as:

M∆((A, B)) := B \ ⋃ {BN | (A, B) ≺ (AN, BN)}   (1)

To get M∆(Cs3) of the example in Fig. 1, {m, s, p, i} and {m, s, p, d} have to be removed from {m, s, p, i, d, s3}, which results in M∆(Cs3) = {s3}. This is the attribute that is typed in bold face in Fig. 1. For each attribute concept, the intent difference M∆ is non-empty and contains equivalent attributes, whose concept (M∆′, M∆″) is the given attribute concept itself. Also, each subset and thus each single attribute in M∆ generates the same attribute concept. So there is a one-to-one relation between an attribute concept and its intent difference. If the context is clarified, i.e., if all equal columns and rows of the context are merged, M∆ contains at most one attribute. The advantage of the clarification is the reduced number of attributes, which speeds up the calculations. Because of the close relation between attributes and attribute concepts, attribute concepts can be used as representatives for (a set of) attributes. For concepts that are not attribute concepts, M∆ is always empty. To check whether a concept is an attribute concept, only the intent difference has to be calculated. The upper neighbors can be calculated by the Neighbors algorithm of Lindig [5] without needing the whole lattice.

In contrast to the complete number of concepts, the number of attribute concepts grows only linearly with the number of clarified attributes. Thus, the exponentially growing number of concepts is caused by the other, non-attribute concepts. Their high number results from the high number of concept-creating object and attribute combinations.

Besides the attribute concepts themselves, their order hierarchy is needed as well. The Neighbors algorithm cannot be used to efficiently find the next upper and lower attribute concepts while avoiding the non-attribute concepts at the same time, because there may be non-attribute concepts in between, resulting in a breadth-first search in the set of super- or subconcepts. An efficient method to get the hierarchy is to create a graph with the attribute concepts as nodes. As mentioned before, each attribute concept can be generated directly using the operator ′ without calculating any other concepts. An edge is added for each pair of the calculated attribute concepts whose intent or extent (both work equivalently) is related by subset. Because of the transitivity of the subset relation, the edges form a transitive closure. A transitive reduction [8] removes the redundant edges. The remaining edges represent the upper and lower neighbor relations among the attribute concepts, forming the attribute concept graph.
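
A minimal Java sketch of this construction (toy context and names are illustrative assumptions, not the paper's implementation); it starts from the attribute extents {a}′ and performs the pairwise subset test and the transitive reduction described above:

import java.util.*;

public class AttributeConceptGraph {
    public static void main(String[] args) {
        // toy clarified context, given as attribute -> extent (objects having that attribute)
        Map<String, Set<String>> extentOf = new LinkedHashMap<>();
        extentOf.put("m", Set.of("1", "2", "3"));
        extentOf.put("p", Set.of("2", "3"));
        extentOf.put("i", Set.of("3"));

        // 1. transitive closure: edge a -> b whenever extent(a) ⊆ extent(b),
        //    i.e. whenever the primitive implication a ⇒ b holds
        Map<String, Set<String>> closure = new LinkedHashMap<>();
        for (String a : extentOf.keySet()) {
            closure.put(a, new LinkedHashSet<>());
            for (String b : extentOf.keySet())
                if (!a.equals(b) && extentOf.get(b).containsAll(extentOf.get(a)))
                    closure.get(a).add(b);
        }

        // 2. transitive reduction: drop a -> b if some c with a -> c and c -> b exists
        //    (sufficient here because a clarified context yields no two equal extents)
        Map<String, Set<String>> graph = new LinkedHashMap<>();
        for (String a : closure.keySet()) {
            Set<String> keep = new LinkedHashSet<>(closure.get(a));
            for (String c : closure.get(a)) keep.removeAll(closure.get(c));
            graph.put(a, keep);
        }
        System.out.println(graph); // {m=[], p=[m], i=[p]} -- edges point to upper neighbors
    }
}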

Figure 2: A lattice and its attribute concept graph

Fig. 2 shows an example lattice and its attribute concept graph as a part of it. Each node contains the extent and the intent difference of the concept. The graph is not a lattice itself, because the infimum and supremum concepts of a set of attribute concepts are not necessarily members of the graph. This has to be considered when using the graph as a lattice replacement.

Using the attribute concept graph, primitive implications of the form a ⇒ b can be derived directly from the graph structure: As defined in the last section, an implication a ⇒ b (a, b ∈ M) holds iff the attribute concept of a is a subconcept of the attribute concept of b. So in the attribute concept graph, each edge, and also each edge in the transitive closure, represents a primitive implication. In Fig. 2, the direction of the edges corresponds to the implication direction. Thus, each attribute concept graph is also an implication graph, which is the base for deriving the feature model structure in the next section.

4. CREATING FEATURE DIAGRAMS

A feature model is usually presented by a feature diagram in the form of a tree and, if necessary, additional constraints. Tree edges (the relationships between a parent and its child features) can be marked as mandatory (the child feature is required), optional (the child feature is optional), or (at least one of the child features has to be selected), and alternative (exactly one of the child features has to be selected). These restrictions hold if the parent feature is selected; if a parent feature is not selected, none of its child features may be selected. For the additional constraints, which cannot be described in the tree, any propositional formula can be used. The feature tree itself can be represented as propositional formulas as well [9], so in general each feature tree is a view of a set of propositional formulas. Usually it is possible to define different feature models whose semantics are equivalent, so in most cases there is no single correct feature model for a given dependency set.

A reducible attribute can be recognized by its attribute concept having more than one upper neighbor in the lattice. Unfortunately, this property cannot be read from the attribute concept graph, since the number of neighbors is not necessarily the same in the lattice and in the graph. Thus, the number of upper neighbors has to be counted in the lattice, which can be done by calculating all upper neighbors with Neighbors. If there is more than one upper neighbor, the corresponding attribute concept has to be marked as reducible. Such marked attribute concepts, such as j in Fig. 2, represent features whose selection is mandatory if all parent features (in this case e, f, and g) are selected, so e ∧ f ∧ g ⇔ j holds. Thus, the redundant feature j is not needed and can be removed from the feature model; the feature mapping can be created later by a conjunction of the parent features. The resulting reduced attribute concept graph is used for the creation of feature models instead. Fig. 3 shows the reduced attribute concept graph of the controller example. In this case the graph is a tree, but this is not the general case. Instead of the 15 existing attributes, the reduced attribute concept graph contains only ten; the five missing attributes ft, fn, ct, cn, and s3 are reducible.
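
As an illustration of this reducibility test, the sketch below uses the fact that the upper neighbors of a concept with extent A are the minimal closures (A ∪ {g})″ for g ∉ A, so no complete lattice is needed. The toy context and all helper names are assumptions for illustration, not the paper's implementation.

import java.util.*;

public class Reducibility {
    static Map<String, Set<String>> ctx = new LinkedHashMap<>(); // object -> attributes

    static Set<String> intentOf(Set<String> objects) {            // A'
        Set<String> intent = null;
        for (String g : objects) {
            if (intent == null) intent = new TreeSet<>(ctx.get(g));
            else intent.retainAll(ctx.get(g));
        }
        return intent == null ? new TreeSet<>() : intent;
    }

    static Set<String> extentOf(Set<String> attrs) {               // B'
        Set<String> extent = new TreeSet<>();
        for (Map.Entry<String, Set<String>> e : ctx.entrySet())
            if (e.getValue().containsAll(attrs)) extent.add(e.getKey());
        return extent;
    }

    static boolean isReducible(String attribute) {
        Set<String> a = extentOf(Set.of(attribute));               // extent of the attribute concept
        List<Set<String>> closures = new ArrayList<>();
        for (String g : ctx.keySet())
            if (!a.contains(g)) {
                Set<String> larger = new TreeSet<>(a);
                larger.add(g);
                closures.add(extentOf(intentOf(larger)));          // (A ∪ {g})''
            }
        long upperNeighbors = closures.stream().distinct()         // minimal closures = upper neighbors
            .filter(c -> closures.stream().noneMatch(d -> c.containsAll(d) && !c.equals(d)))
            .count();
        return upperNeighbors > 1;                                 // criterion used above
    }

    public static void main(String[] args) {
        ctx.put("1", Set.of("m"));
        ctx.put("5", Set.of("m", "i", "d", "s3"));
        ctx.put("3", Set.of("m", "i"));
        ctx.put("4", Set.of("m", "d"));
        System.out.println(isReducible("s3")); // true: its concept lies below both i and d
        System.out.println(isReducible("i"));  // false
    }
}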

Table 2: Semantics of feature relations

Relation (SF: subfeature)                 Semantics
fo is an optional SF of f                 fo ⇒ f
fm is a mandatory SF of f                 fm ⇔ f
f1, …, fn are or SFs of f                 f1 ∨ … ∨ fn ⇔ f
f1, …, fn are alternative SFs of f        (f1 ∨ … ∨ fn ⇔ f) ∧ ⋀_{i<j} ¬(fi ∧ fj)

… with m, because implications which conclude to m (… ⇒ m) are always true, since m exists in all variants. The bottom concept (⊥), although not an attribute concept, is always included, because it is needed by the implication extraction algorithm. But can results of different attribute concept graphs be used together? In this case, it is possible. This can be proven by comparing the behavior of the lattice and its attribute concept graph when attributes are added to the context. In the lattice, adding new attributes will change existing concepts or add new concepts, but never remove existing concepts. This property was also used by Obiedkov and Duquenne in [7] to calculate the implication base and all concepts by adding attributes one by one. A changed concept keeps its extent and gets an intent extended by the newly added equivalent attributes. The same holds for attribute concepts. The attribute concepts that already exist in the reduced attribute concept graph, representing features, are marked with a thick border in Fig. 5. The attribute concepts of p, h, t, and n are extended by negated attributes, because they form two alternative pairs.

As described in Sec. 3, the primitive implications can be derived directly from the attribute concept graph edges. Implications of the reduced attribute concept graph, which are used implicitly during the feature diagram extraction in Sec. 4, are also valid in this graph. However, it is difficult to find the other implications. How can the attribute concept graph be used to find them? As mentioned in Sec. 2, an implication of attributes P ⇒ C (P, C ⊆ M) holds iff the concept (P′, P″) is a subconcept of or equal to the concept (C′, C″). Any concept (P′, P″) can be expressed as an infimum of attribute concepts:

(P′, P″) = inf{(p1′, p1″), …, (pn′, pn″)} for P = {p1, …, pn}.   (2)

Because of the infimum definition, the extent of the infimum concept is defined by the intersection of the attribute concepts' extents:

P′ = ⋂_{1≤i≤n} pi′   (3)

The same holds for the conclusion attribute set C, but in this work only single-attribute conclusions C = {c} are used. This is not a restriction, because each implication a ⇒ c1 ∧ c2 can be rewritten as (a ⇒ c1) ∧ (a ⇒ c2). So, finding implications in the attribute concept graph means finding attribute concepts (p1′, p1″), …, (pn′, pn″), and (c′, c″) for which the following holds:

⋂_{1≤i≤n} pi′ ⊆ c′  ⇒  ⋂_{1≤i≤n} (pi′ \ c′) = ∅  ⇒  ⋃_{1≤i≤n} (G \ (pi′ \ c′)) = G   (4)

First, the extent c′ is subtracted from both sides. Then both sides are negated and De Morgan's law for sets is applied. The complement sets are resolved based on the complete set of objects G. The result is a set-cover problem over the extents of attribute concepts. To get minimal implications, the solutions of a minimal set-cover problem have to be found for each attribute concept (c′, c″). As shown in (4), the complete set to cover is G, and the candidate sets are derived from the other attribute concepts.

Using Alg. 2, all implications can be extracted. After the simple calculation of the primitive implications (lines 1–3), the complex ones are calculated. First, the candidate attribute concepts are determined. To prevent the extraction of redundant implications, only a subset of the attribute concepts is used (line 5): take the lower neighbors l of c in the attribute concept graph (≻G) and then their upper neighbors u in the lattice (≺L); the candidates are all attribute concepts p that are greater than or equal to the concepts u. This restriction ensures that implications are not found multiple times and increases the performance significantly. However, the primitive implications cannot be found in this way, which requires the extra calculation step of lines 1–3.

Reducible attribute concepts always create redundant implications and can be omitted from the candidate list. The cover candidates (lines 6–9) are calculated based on (4). The solutions of the minimal set-cover problem are used to get the implication premises (line 11). Some of the found implications are still redundant, but they can be removed in polynomial time. The result is a set of premises, each creating an implication with c as conclusion. The resulting implication base is defined among attribute concepts, but can be rewritten in terms of attributes. A proof of completeness of this algorithm is presented in [11].

Algorithm 2 Finding all implications
Input: attribute concept graph ACG = (Cattr ∪ {⊥}, E)
 1: for all edges (s, t) ∈ E, where s ≠ ⊥ do
 2:   addImplication(s ⇒ t)
 3: end for
 4: for all attribute concepts c in ACG do
 5:   Ccand ← {p ∈ Cattr \ (Cred ∪ {c}) | c ≻G l ≺L u ≤ p}
 6:   S ← ∅
 7:   for all p ∈ Ccand do
 8:     S ← S ∪ {G \ (ext(p) \ ext(c))}
 9:   end for
10:   Smin ← minSetCover(G, S)
11:   P ← convert(Smin)
12:   addImplications(P ⇒ c)
13: end for
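
The minSetCover call in line 10 has to deliver all minimal covers, from which the implication premises are then derived. The following naive Java enumeration is purely illustrative (exponential in the number of candidate sets and not the paper's implementation); it only makes explicit what the step computes.

import java.util.*;

public class MinSetCover {
    // Returns every inclusion-minimal selection of candidate sets whose union is the universe.
    static List<List<Integer>> minimalCovers(Set<String> universe, List<Set<String>> candidates) {
        List<List<Integer>> covers = new ArrayList<>();
        int n = candidates.size();
        for (int mask = 1; mask < (1 << n); mask++) {
            Set<String> union = new HashSet<>();
            List<Integer> chosen = new ArrayList<>();
            for (int i = 0; i < n; i++)
                if ((mask & (1 << i)) != 0) { union.addAll(candidates.get(i)); chosen.add(i); }
            if (!union.containsAll(universe)) continue;
            boolean minimal = true;                       // no chosen set may be redundant
            for (int i : chosen) {
                Set<String> without = new HashSet<>();
                for (int j : chosen) if (j != i) without.addAll(candidates.get(j));
                if (without.containsAll(universe)) { minimal = false; break; }
            }
            if (minimal) covers.add(chosen);
        }
        return covers;
    }

    public static void main(String[] args) {
        Set<String> g = Set.of("1", "2", "3", "4");
        List<Set<String>> cands = List.of(Set.of("1", "2"), Set.of("2", "3"),
                                          Set.of("3", "4"), Set.of("1", "4"));
        System.out.println(minimalCovers(g, cands)); // [[0, 2], [1, 3]]
    }
}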

Although this algorithm is based on set-covers with exponential complexity, the results are calculated in seconds for realistic examples. This is because the problem size is decreased significantly, since only a subset of the attribute concepts is used instead of all concepts of the lattice. So the runtime of this algorithm is far better than the runtimes of hours up to days needed by the existing algorithms. For the controller example, the algorithm extracts 34 primitive and 23 complex implications and takes around 100 ms. The complex implications are shown below:

  p ∧ ¬cp ⇔ s              *
  p ∧ ¬s ⇔ cp              *
  ¬s2 ∧ ¬i ⇔ ¬s            F
  ¬d ∧ ¬i ⇔ ¬s             F
  ¬s2 ∧ ¬d ⇔ ¬s            F
  d ∧ ¬i ⇒ s2              F
  ¬d ∧ i ⇒ s2              F
  s ∧ ¬i ⇒ s2              F
  s ∧ ¬d ⇒ s2              F
  p ∧ ¬s2 ∧ ¬i ⇔ cp        F
  p ∧ ¬s2 ∧ ¬d ⇔ cp        F
  t ∧ p ⇔ ft
  n ∧ p ⇔ fn
  ¬s2 ∧ i ⇔ s3
  ¬s2 ∧ d ⇔ s3
  d ∧ i ⇔ s3
  s ∧ ¬s2 ⇔ s3
  t ∧ ¬cp ∧ ¬s2 ∧ ¬d ⇔ ct
  t ∧ ¬cp ∧ ¬s2 ∧ ¬i ⇔ ct
  t ∧ ¬cp ∧ ¬d ∧ ¬i ⇔ ct
  n ∧ ¬cp ∧ ¬s2 ∧ ¬i ⇔ cn
  n ∧ ¬cp ∧ ¬d ∧ ¬i ⇔ cn
  n ∧ ¬cp ∧ ¬s2 ∧ ¬d ⇔ cn

Some of the shown implications are equivalences, because there are primitive implications that describe the inverse direction. Only two of the complex implications are covered by the feature diagram and can be omitted; they are marked with an *. They belong to the alternative relation between s and cp as shown in Fig. 4b. The implications marked with F describe further dependencies among the features. They are feature constraints which have to be added to the feature model. These constraints specialize the or-relation among d, i, and s2: if feature s is selected, exactly two of its three subfeatures d, i, and s2 have to be selected, too. The other implications describe redundant features. They are not used here, since the redundant features are not part of the feature model, but they are needed to create the feature mapping. The feature mapping consists of annotations of model artifacts as described in [12]. For each feature there is a one-to-one mapping, and for redundant features the mapping is based on the implications calculated above.

6. PERFORMANCE

To evaluate the performance of the introduced algorithms, first the complexity of using whole lattices and of using attribute concept graphs is analyzed. The complexity of calculating one concept based on a set of objects or attributes is O(|G| × |M|). The neighbors of a concept can be calculated in O(|G|² × |M|) and the complete lattice (concepts and hierarchy) in O(|L| × |G|² × |M|) [5], where |L| is the number of concepts, which increases exponentially with the number of object-attribute relations |I|. In the worst case, this results in exponential runtime. The attribute concept graph, which is used as lattice replacement in this work, can instead be calculated in polynomial time: first, the |M| attribute concepts and their upper neighbors (needed to check whether they are reducible) are calculated, which can be done in O(|G|² × |M|²). Then the edges have to be created using pairwise comparison (O(|M|²)) and transitive reduction (O(|M|³)). Thus, the complete attribute concept graph can be created in O(|G|² × |M|² + |M|³). Although exponential, the introduced algorithms to find or- and alternative relations and implications only have to solve problems whose input size is |G| × |M|, contrary to the usage of exponentially large lattices. Only the usage of attribute concept graphs makes it possible to calculate feature models from contexts in reasonable time.

Table 3: Case study results

      |O|×|M|   #F   #R    #C   td (ms)   ti (ms)   tdg (ms)
I     17×66     16    5    57        67        73       141
II    26×79     34    8     0       523       292     3.2 h
III   64×51     18    8    77        49       121       156
IV    43×98     23   11    91       125       390      1265
V     33×134    41    8     0       355       506      >3 d
VI    45×150    42   19  1771      76 s      15 s     123 s
VII   63×125    26   10   431       634      1586      2530

#F: number of features, #R: number of relations, #C: number of feature constraints, td: runtime for feature diagram extraction, ti: runtime for implication extraction and SAT reduction, tdg: runtime for calculating the Duquenne-Guigues base

The described algorithms were implemented in Java, using Colibri-Java [13] for the concept calculation and Sat4J [14] for removing implications that are covered by the feature tree. As a case study, contexts were used that were generated by automatic comparison and matching of existing data-flow models. Tab. 3 shows the results for the seven largest contexts that occurred in the case study. The first column shows the size of the context including negated attributes. #F and #R specify the number of features and relations (options, alternatives, and ors) of the generated feature tree; relations that are removed from the feature graph to get a tree are not included. The number of feature constraints, i.e., the implications that define dependencies among non-redundant features and that are not yet covered by the feature tree, is shown in column #C. The feature models created from the contexts II and V do not need additional constraints, because in these cases all calculated implications are already covered by feature relations. The columns td and ti

show the runtimes (on a single-core Pentium 4 at 3.2 GHz) of the extraction of the feature diagram and the calculation of the reduced implication set. Except for context VI, the runtimes are less than a few seconds. For comparison, tdg shows the runtimes to calculate the Duquenne-Guigues base (the minimal implication set) using [4]. Comparing ti to tdg for small contexts, our algorithm is only slightly faster than the existing one. But for larger contexts the difference is significant as shown for the contexts IV and VI, and especially for contexts II and V.

7. RELATED WORK

Eisenbarth et al. [15] used formal concept analysis to analyze the dependencies between predefined features and their required components, which are extracted from scenario traces. To this end, the complete lattice is calculated using standard tools and used for the creation of mappings between features and components. However, they do not analyze the dependencies among the features themselves, which is the topic of our work. Loesch et al. [16] used formal concept analysis to check existing product configurations and their selected features. They analyze which features are always or never used, are mutually exclusive, or are used in pairs. Also the implication base, calculated by standard tools, is used to create feature constraints. But, as mentioned above, the standard algorithms do not scale well, and their approach does not generate feature models. In [17], Czarnecki and Wąsowski extract feature models from a set of propositional formulas, which are derived from existing feature models. In our work, and also in many other library migration problems, there are neither preexisting feature models nor propositional formulas, and as our work shows, bridging the gap between the incidence matrix and propositional formulas is not trivial. However, there are some analogies in the algorithms: Using binary decision diagrams (BDDs) and valid domains, they construct an implication graph to get the structure of the feature diagram. This is covered by the less complex creation of attribute concept graphs from contexts in this work. Next, they calculate prime implicants on BDDs to find minimal or-relations and alternatives, which corresponds to the minimal set-cover on extents in this work. In contrast to this work, they also support mandatory features, because they do not merge equivalent features. The arising decision problem of the parent-child order is solved by She et al. [18] using external information. Because equivalent features are strictly merged and no external information is available, that problem does not arise in our work.

8. CONCLUSIONS

The described approach creates feature models that specify exactly the variants given as input in the form of an incidence matrix. Using optimized methods based on formal concept analysis, the feature models are generated in very short time, because these methods scale significantly better than the standard formal-concept-analysis algorithms for calculating lattices and implication bases.

9. REFERENCES

[1] Uwe Ryssel, Joern Ploennigs, and Klaus Kabitzsch. Automatic library migration for the generation of hardware-in-the-loop models. Science of Computer Programming, 2010.
[2] Uwe Ryssel, Joern Ploennigs, and Klaus Kabitzsch. Automatic variation-point identification in function-block-based models. In Int. Conf. on Generative Programming and Component Engineering, pages 23–32, 2010.
[3] Bernhard Ganter and Rudolf Wille. Formal Concept Analysis — Mathematical Foundations. Springer, 1999.
[4] Bernhard Ganter. Two basic algorithms in concept analysis. In Int. Conf. on Formal Concept Analysis, pages 312–340, 2010.
[5] Christian Lindig. Fast concept analysis. In Working with Conceptual Structures — Contributions to ICCS 2000, pages 152–161. Shaker Verlag, 2000.
[6] J. L. Guigues and V. Duquenne. Familles minimales d'implications informatives résultant d'un tableau de données binaires. Mathématiques et Sciences Humaines, 95:5–18, 1986.
[7] S. Obiedkov and V. Duquenne. Attribute-incremental construction of the canonical implication basis. Annals of Mathematics and Artificial Intelligence, 49(1–4):77–99, 2007.
[8] J. La Poutré and J. van Leeuwen. Maintenance of transitive closures and transitive reductions of graphs. In Graph-Theoretic Concepts in Computer Science, volume 314 of LNCS, pages 106–120. Springer, 1988.
[9] Don Batory. Feature models, grammars, and propositional formulas. In Int. Software Product Line Conference, pages 7–20, 2005.
[10] Marcilio Mendonca, Andrzej Wąsowski, and Krzysztof Czarnecki. SAT-based analysis of feature models is easy. In Int. Software Product Line Conference, pages 231–240, 2009.
[11] Uwe Ryssel, Felix Distel, and Daniel Borchmann. Fast computation of proper premises. In Int. Conf. on Concept Lattices and Their Applications, 2011. Submitted.
[12] Krzysztof Czarnecki and Michal Antkiewicz. Mapping features to models: A template approach based on superimposed variants. In Int. Conf. on Generative Programming and Component Engineering, pages 422–437, 2005.
[13] Christian Lindig and Daniel Götzmann. Colibri-Java — Formal concept analysis implemented in Java, 2007. http://code.google.com/p/colibri-java/.
[14] Daniel Le Berre and Anne Parrain. The Sat4j library, release 2.2. Journal on Satisfiability, Boolean Modeling and Computation, 7:59–64, 2010.
[15] Thomas Eisenbarth, Rainer Koschke, and Daniel Simon. Derivation of feature component maps by means of concept analysis. In 5th Eu. Conf. on Software Maintenance and Reengineering, pages 176–179, 2001.
[16] Felix Loesch and Erhard Ploedereder. Restructuring variability in software product lines using concept analysis of product configurations. In 11th Eu. Conf. on Software Maintenance and Reengineering, pages 159–170, 2007.
[17] Krzysztof Czarnecki and Andrzej Wąsowski. Feature diagrams and logics: There and back again. In Int. Software Product Line Conference, pages 23–34, 2007.
[18] Steven She, Rafael Lotufo, Thorsten Berger, Andrzej Wąsowski, and Krzysztof Czarnecki. Reverse engineering feature models. In Int. Conf. on Software Engineering, pages 461–470, 2011.
