Reducing Quantization Error and Contextual Bias Problems in Object-Oriented Methods by Applying Fuzzy-Logic Techniques

Mehmet Aksit† and Francesco Marcelloni††

† TRESE project, Department of Computer Science, University of Twente, P.O. Box 217, 7500 AE Enschede, The Netherlands. E-mail: [email protected], WWW server: http://wwwtrese.cs.utwente.nl
†† Department of Information Engineering, University of Pisa, Via Diotisalvi, 2-56126, Pisa, Italy. E-mail: [email protected]

Abstract

Object-oriented methods define a considerable number of rules which are generally expressed using two-valued logic. For example, an entity in a requirement specification is either accepted or rejected as a class. There are two major problems in how rules are defined and applied in current methods. Firstly, two-valued logic cannot effectively express the approximate and inexact nature of a typical software development process. Secondly, the influence of contextual factors on rules is generally not modeled explicitly. This article terms these problems the quantization error and contextual bias problems, respectively. To reduce these problems, fuzzy-logic techniques are proposed. This approach is method independent and is useful for evaluating and enhancing current methods. In addition, fuzzy-logic based techniques increase the adaptability and reusability of design models. The quantization error and contextual bias problems, and the usefulness of the fuzzy-logic based approach in reducing them, are illustrated first formally and then intuitively through a simple example. The fuzzy-logic based method is implemented and tested by using our fuzzy-logic based experimental CASE environment.

Index terms: Object-oriented methods, fuzzy-logic based reasoning, quantization error, adaptable design models, development environments, reuse

Correspondence address: Mehmet Aksit, University of Twente, P.O. Box 217, 7500 AE, Enschede, The Netherlands. E-mail: [email protected]. Fax: +31-53-4893503. Tel: +31-53-4892638.

1. Introduction

During the last several years, a considerable number of object-oriented methods [5, 9, 15, 21] have been introduced to create robust, reusable and adaptable software systems. Object-oriented methods create software artifacts by exploiting object-oriented concepts¹ through the application of a large number of rules. For example, OMT [21] introduces rules for identifying and discarding classes, associations, part-of and inheritance relations, state-transition and data-flow diagrams. Basically, these rules are based on two-valued logic. For example, a candidate class is generally identified by applying the rule: If an entity in a requirement specification is relevant then select it as a candidate class.

We consider two major problems, termed the quantization error and contextual bias problems, in the way rules are defined and applied in current object-oriented methods. Firstly, two-valued logic does not provide an effective means for capturing the approximate and inexact nature of a typical software development process. For example, to identify a class, the software engineer has to determine whether the entity being considered is relevant or not for the application domain. The software engineer may conclude that the entity partially fulfils the relevance criterion, and may prefer to use expressions like the entity is 70 percent relevant, or substantially relevant. However, two-valued logic forces the software engineer to take abrupt decisions, such as accepting or rejecting the entity as a class. This results in a loss of information, because the partial relevance of the entity is not modeled and therefore cannot be considered explicitly in the subsequent phases. Secondly, the validity of a rule may largely depend on contextual factors such as the application domain, changes in the user's interest and technological advances.
Unless the contextual factors that influence a given rule are defined explicitly, the applicability of that rule cannot be determined and controlled effectively.

This article has three contributions. Firstly, it introduces the notion and mathematical formulation of the quantization error problem, which can be used to analyze, compare and improve current object-oriented methods. Secondly, a new fuzzy-logic based object-oriented software development technique is introduced to reduce quantization errors. This technique is not specific to a particular object-oriented method. In addition, the fuzzy-logic based technique increases the adaptability and reusability of design models. Finally, the influence of contextual factors on rules is explicitly modeled and controlled by dynamically adapting the domain of contextual variables. The quantization error and contextual bias problems, and the usefulness of the fuzzy-logic based approach in reducing them, are illustrated first formally and then intuitively through a simple example. The fuzzy-logic based method is implemented and tested by using our experimental CASE environment [7].

¹ The term concept refers to the types of software artifacts of an object-oriented development process. Typical examples of object-oriented concepts are Class, Object, Association, Part-of relation, Inheritance relation, Attribute, Operation and State-transition diagram.


This paper is organized as follows. The next section introduces the notion and the formulation of quantization errors in object-oriented methods. Further, this section illustrates the quantization error problem with the help of a simple example. Section 3 defines the contextual bias problem. Section 4 identifies the requirements for reducing the quantization error and contextual bias problems. Section 5 introduces the fuzzy-logic based software development technique and illustrates its applicability by using a simple example. An evaluation of the approach is presented in section 6. Section 7 summarizes the related work. Future work is described in section 8. Finally, section 9 gives conclusions.

2. The Quantization Error Problem

This section defines the quantization error problem in current object-oriented methods. The application of two-valued logic in software development is explained in section 2.1. Most methods adopt this kind of reasoning, generally in an informal way. Section 2.2 explains the quantization error problem in signal processing as an analogy to the software development process. Section 2.3 models the quantization error in two-valued logic based methods. An evaluation of the accuracy of the formulas is presented in section 2.4. Section 2.5 illustrates the effect of the quantization error by means of a simple example.

2.1 Two-valued Logic Based Rules

Assume that the following rule is used to identify candidate classes:

IF AN ENTITY IN A REQUIREMENT SPECIFICATION IS RELEVANT AND CAN EXIST AUTONOMOUSLY IN THE APPLICATION DOMAIN THEN SELECT IT AS A CANDIDATE CLASS.

Here, an entity in a requirement specification and a candidate class are the two object-oriented concepts to be reasoned about. The term THEN separates the antecedent and the consequent of the rule. The antecedent consists of two conditions composed by the connective AND. Relevant and Autonomously are the input values for the first and second conditions, respectively. In two-valued logic, a consequent is true if its antecedent is true. If the consequent is true, then the result of this rule is the classification of an entity in a requirement specification as a candidate class.

For illustration purposes, we will refer to similar rules which are commonly adopted by object-oriented methods. After identifying candidate classes, redundant classes can be eliminated, for instance by using rule Redundant Class Elimination:

IF TWO CANDIDATE CLASSES EXPRESS THE SAME INFORMATION THEN DISCARD THE LEAST DESCRIPTIVE ONE.

Here, rule Candidate Class Identification is coupled to rule Redundant Class Elimination; two rules are coupled if the result of one rule is the input of another rule. In general, the application of a rule quantizes a set of object-oriented artifacts into two subsets: accepted or rejected. Once an artifact has been classified, for instance into the rejected set of a rule, it is not considered anymore by the rules that apply to the accepted set of that rule. For example, after applying rule Candidate Class Identification, if an entity in a requirement specification is not selected as a candidate class, then this entity will not be considered by rule Redundant Class Elimination. Of course, a rejected entity can be considered by another rule which applies to the entities in a requirement specification. Consider, for example, rule Attribute Identification:

IF AN ENTITY IN A REQUIREMENT SPECIFICATION IS RELEVANT AND CANNOT EXIST AUTONOMOUSLY IN THE APPLICATION DOMAIN, THEN IDENTIFY IT AS AN ATTRIBUTE.
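To make the two-valued character of these rules concrete, the sketch below (illustrative only; the predicate names and the 0.5 threshold are our own, not part of any published method) expresses rules Candidate Class Identification and Attribute Identification as Boolean functions. Note how a graded judgement must be collapsed to True or False before either rule can fire.

```python
# Illustrative sketch, not from the paper: the two rules above as Boolean predicates.
# The function names and the 0.5 threshold are assumptions made for this example.

def candidate_class(relevant: bool, autonomous: bool) -> bool:
    """Rule Candidate Class Identification: both conditions must hold."""
    return relevant and autonomous

def attribute(relevant: bool, autonomous: bool) -> bool:
    """Rule Attribute Identification: relevant but not autonomously existing."""
    return relevant and not autonomous

# A graded judgement ("the entity is 70 percent relevant") is forced to a
# two-valued decision before the rules can be applied:
graded_relevance = 0.70
relevant = graded_relevance >= 0.5      # abrupt decision: partial relevance is lost
autonomous = False

print(candidate_class(relevant, autonomous))   # rejected as a candidate class
print(attribute(relevant, autonomous))         # accepted as an attribute
```

Once the entity lands in the rejected set of the first rule, the value 0.70 never reaches any coupled rule; only the Boolean survives.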

This rule can be applied to the entities in a requirement specification which are rejected by rule Candidate Class Identification. If all the rules which are applicable to an entity in a requirement specification reject that entity, then the entity is practically discarded.

2.2 An Analogy: Quantization Process of a Sampled Signal

We think that the quantization process as defined by current methods is problematic and generates a high quantization error. To make this clear, we refer to the area of digital signal processing, because quantization errors have been extensively studied in this field [22]. In digital signal processing, the quantization process assigns the amplitudes of a sampled analog signal to a prescribed number of discrete quantization levels. This results in a loss of information because the quantized signal is an approximation of the analog signal. The quantization error is defined as the difference between an analog signal sample and the corresponding quantized sample.

Figure 1(a) shows a sampled signal before the quantization process. The Y axis indicates the amplitude of the signal, which varies between 0 and A. The X axis shows the signal samples. In digital signal processing, the X axis typically indicates time. Figure 1(b) shows the signal after the quantization process. Here, there are eight uniformly separated quantization levels. The amplitude of each signal sample is approximated to the level that is considered most appropriate.

Fig. 1. Quantization process: (a) sampled signal (b) quantized signal.

In the following, we will formulate the error in the quantization process of a sampled signal. In the next section, we will use these formulas as a basis to calculate quantization errors in object-oriented methods. The quantization error is directly related to the difference between adjacent discrete amplitude levels. The error can be reduced to any desired degree by choosing the difference between levels small enough. Figure 2 shows an example of a quantization process. Here, Aj is the amplitude of level j, and Uj and Lj are the upper and lower threshold values of level j, respectively.

Fig. 2. Quantization error with N quantization levels.

Now assume that the amplitude of a signal which varies in the range of uncertainty between Lj and Uj is approximated to level j. In digital signal processing, the root mean square value of the quantization error is generally considered as a good index to compare different quantization processes. If the amplitude distribution of the signal is known, the root mean square value of the quantization error generated by quantizing the amplitude to a generic level j can be computed by using formula (1):

$$\varepsilon_j = \sqrt{\int_{D_j} (a - A_j)^2 \; pd(a \mid Output = A_j)\, da} \qquad (1)$$

where $a$ is the amplitude of the signal, $D_j = U_j - L_j$ is the domain of the signal when the output of the quantization process is $A_j$, and $pd(a \mid Output = A_j)$ is the probability density function of $a$ conditioned upon the event output of quantization process = $A_j$. Suppose that over a long period of time all possible amplitude values of the signal appear the same number of times. Then $pd(a \mid Output = A_j) = \frac{1}{U_j - L_j}$, and formula (1) becomes

$$\varepsilon_j = \sqrt{\int_{L_j}^{U_j} (a - A_j)^2 \, \frac{1}{U_j - L_j}\, da} = \sqrt{\frac{(U_j - A_j)^3 - (L_j - A_j)^3}{3\,(U_j - L_j)}} \qquad (2)$$

Let us assume that N is the number of quantization levels and A is the maximum value of the signal. If the quantization levels are uniformly separated, then the distance d between two adjacent levels is equal to A/(N-1). For each quantization level j, Uj - Lj is equal to d. Then formula (2) can be reduced to

$$\varepsilon_j = \sqrt{\frac{1}{3}} \cdot \frac{d}{2} = \frac{A}{2\,(N-1)\,\sqrt{3}} \qquad (3)$$

It is clear from (3) that the root mean square value of the quantization error decreases as the number of quantization levels increases. The global root mean square value of the quantization error can be computed as

$$\varepsilon = \sqrt{\sum_{j=0}^{N-1} \varepsilon_j^2 \cdot p(Output = A_j)} \qquad (4)$$

where $p(Output = A_j)$ is the probability of having $A_j$ as the output of the quantization process.


If the amplitude of the signal is uniformly distributed, then $\varepsilon = \varepsilon_j = \frac{A}{2\,(N-1)\,\sqrt{3}}$.
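As a quick sanity check (our own numerical experiment, not part of the paper), the closed form $A / (2(N-1)\sqrt{3})$ can be compared against a simulated uniform quantizer; the level count N and the sample count below are arbitrary choices.

```python
import math
import random

# Sketch: empirically estimate the RMS quantization error of an N-level uniform
# quantizer over [0, A] and compare it with the closed form A / (2 (N-1) sqrt(3)).

def quantize(a: float, A: float, N: int) -> float:
    """Round amplitude a to the nearest of N uniformly spaced levels in [0, A]."""
    d = A / (N - 1)                  # distance between adjacent levels
    return round(a / d) * d

def rms_error(A: float, N: int, samples: int = 200_000) -> float:
    rng = random.Random(42)          # fixed seed for reproducibility
    total = 0.0
    for _ in range(samples):
        a = rng.uniform(0.0, A)      # uniformly distributed amplitude
        total += (a - quantize(a, A, N)) ** 2
    return math.sqrt(total / samples)

A, N = 1.0, 8
predicted = A / (2 * (N - 1) * math.sqrt(3))
print(rms_error(A, N), predicted)    # the two values should agree closely
```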

2.3 Quantization Error in Two-Valued Logic Based Methods

In two-valued logic based software development methods, high quantization errors arise from the fact that rules adopt only two quantization levels. For example, rule Candidate Class Identification requires the software engineer to decide whether an entity in a requirement specification is relevant or not. The software engineer may, however, conclude that an entity partially fulfils the relevance criterion, and may prefer to use expressions like the entity is substantially relevant, or the entity is 70 percent relevant. This is quite similar to the quantization error in signal processing. If we adopt, for instance, Figure 2 to represent the quantization process of object-oriented methods, then the Y axis represents the relevance of an entity and the X axis indicates the entities being considered. We suppose that the relevance can assume values between 0 and 1. If, for example, the software engineer concludes that an entity in a requirement specification is 70 percent relevant, then 0.70 has to be approximated to 0 or 1. If 1 is selected, then the value of the quantization error will be 0.3. This means that there is a loss of information about the relevance of the entity. We call this the quantization error problem of object-oriented methods.

Assume that if the relevance value is between 0 and 1/2 it is approximated to 0, and if it is between 1/2 and 1 it is approximated to 1. Suppose that over a long period of time all possible relevance values appear the same number of times. We can compute the root mean square value of the quantization error in the relevance by applying formula (3). This computation results in 0.289.

As methods are composed of a set of rules, we are also interested in calculating quantization errors which affect the results of rules.
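The 0.289 figure can be reproduced in a few lines (an illustrative computation of ours, applying formula (3) with N = 2 and A = 1):

```python
import math

# Sketch: exact RMS error of the two-level quantization of relevance.
# With levels {0, 1} and threshold 1/2, the squared error integrates to 1/12
# over a relevance value uniformly distributed in [0, 1].

mean_square = 2 * (0.5 ** 3) / 3        # two symmetric halves of the integral of r^2
rms = math.sqrt(mean_square)
print(round(rms, 3))                    # 0.289, as obtained from formula (3)

# A single judgement of 0.70 relevance is forced to level 1:
relevance = 0.70
quantized = 1 if relevance >= 0.5 else 0
print(round(quantized - relevance, 2))  # 0.3 of relevance information is lost
```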
In calculating the quantization error, we suppose that the truth value of a condition varies between 0 (false) and 1 (true), and that the truth value of the consequent increases linearly with the product of the truth values of the conditions. Consider a rule whose antecedent is composed of $n$ conditions. Let $c_i$ be the truth value of condition $i$ and let $pd(c_1, c_2, \ldots, c_n)$ be the joint probability density of $c_1, c_2, \ldots, c_n$. We can calculate the root mean square value of the quantization error when the result is approximated to $A_j$ as

$$\varepsilon_j = \sqrt{\int_{D_j} (c - A_j)^2 \cdot pd(c \mid Output = A_j)\, dc} \qquad (5)$$

where $D_j$ indicates the domain on which the truth values of the conditions vary when the result of the rule is $A_j$, and $c$, $dc$ and $pd(c \mid Output = A_j)$ denote $c_1 \cdot c_2 \cdots c_n$, $dc_1 \cdot dc_2 \cdots dc_n$ and $pd(c_1 \mid Output = A_j,\; c_2 \mid Output = A_j,\; \ldots,\; c_n \mid Output = A_j)$, respectively. Let us suppose that the random variables $c_1, c_2, \ldots, c_n$ are independent. Then

$$pd(c \mid Output = A_j) = \prod_{i=1}^{n} pd(c_i \mid Output = A_j).$$


We observe that for most rules the truth values of the conditions and of the consequent increase linearly with the input values and the result, respectively. Further, for simplicity, we assume that over a long period of time all possible truth values of the conditions appear the same number of times. Then $pd(c_i \mid Output = A_j) = \frac{1}{U_j^i - L_j^i}$, where $U_j^i - L_j^i$ is the domain of the truth value of condition $i$ when the output is $A_j$ [20].

In two-valued logic, the consequent of a rule is true if the conditions in the antecedent of the rule are true. This means that the truth value of the consequent is approximated to 1 when all the conditions have truth values between 1/2 and 1. In the case $A_j = 1$, $D_1 = D_1^1 \times D_1^2 \times \ldots \times D_1^n$, where $D_1^i = [\tfrac{1}{2}, 1]$ is the domain of $c_i$ when the output is 1. We can now compute $\varepsilon_1$ by applying formula (5). Here, $pd(c \mid Output = 1) = 2^n$. We obtain

$$\varepsilon_1 = \sqrt{2^n \int_{1/2}^{1} \int_{1/2}^{1} \cdots \int_{1/2}^{1} (c - 1)^2 \, dc} = \sqrt{1 + \left(\tfrac{7}{12}\right)^n - 2 \cdot \left(\tfrac{3}{4}\right)^n} \qquad (6)$$
Formula (6) shows that, when the result is quantized to 1 (true), the root mean square value of the quantization error increases with the number of conditions in a rule and approaches 1 when $n \to \infty$. As an example, let us calculate the root mean square value of the quantization error which affects the result of rule Candidate Class Identification when an entity is accepted as a candidate class. By substituting n = 2 in formula (6), we obtain

$$\varepsilon_1 = \sqrt{1 + \left(\tfrac{7}{12}\right)^2 - 2 \cdot \left(\tfrac{3}{4}\right)^2} = 0.464$$

It follows that each entity in the requirement specification identified as a candidate class matches, on average, only 53.6 percent of the definition of an ideal candidate class.

Now we calculate the quantization error in the case $A_j = 0$. Theoretically, if the antecedent of a rule is false, it is not possible to infer that the consequent of the rule is false. Nevertheless, it is often appropriate to make use of the closed-world assumption, which states that anything which cannot be inferred as true from the given facts and the available rules is false. We can easily compute $\varepsilon_0$ from formula (5) by considering $D_0 = D - D_1$, where $D = D^1 \times D^2 \times \ldots \times D^n$ and $D^i = [0, 1]$. Thus,

$$\varepsilon_0 = \sqrt{\int_{D} c^2 \cdot pd(c \mid Output = 0)\, dc - \int_{D_1} c^2 \cdot pd(c \mid Output = 0)\, dc} = \sqrt{\frac{1}{3^n} \cdot \frac{1 - \left(\tfrac{7}{8}\right)^n}{1 - \left(\tfrac{1}{2}\right)^n}} \qquad (7)$$

as $pd(c \mid Output = 0) = \frac{1}{1 - \frac{1}{2^n}}$.
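The closed forms (6) and (7) can be cross-checked numerically (our own sketch; the Monte Carlo setup mirrors the assumptions above: independent, uniformly distributed condition truth values, with the consequent equal to their product):

```python
import math
import random

# Closed forms for the RMS quantization error of a rule with n conditions,
# as given by formulas (6) and (7).
def eps1(n: int) -> float:
    return math.sqrt(1 + (7 / 12) ** n - 2 * (3 / 4) ** n)

def eps0(n: int) -> float:
    return math.sqrt((1 / 3 ** n) * (1 - (7 / 8) ** n) / (1 - (1 / 2) ** n))

def monte_carlo(n: int, samples: int = 300_000):
    """Estimate both errors by sampling condition truth values uniformly in [0, 1]."""
    rng = random.Random(0)
    sq1 = sq0 = 0.0
    n1 = n0 = 0
    for _ in range(samples):
        cs = [rng.random() for _ in range(n)]
        c = math.prod(cs)                    # truth value of the consequent
        if all(ci >= 0.5 for ci in cs):      # result quantized to 1 (true)
            sq1 += (c - 1) ** 2
            n1 += 1
        else:                                # result quantized to 0 (false)
            sq0 += c ** 2
            n0 += 1
    return math.sqrt(sq1 / n1), math.sqrt(sq0 / n0)

print(eps1(2), eps0(2))    # approximately 0.464 and 0.186, as in the text
print(monte_carlo(2))      # the simulated values should agree closely
```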


Formula (7) shows that, when the result is quantized to 0 (false), the root mean square value of the quantization error decreases with the number of conditions in a rule and approaches 0 when $n \to \infty$. As an example, let us calculate the root mean square value of the quantization error which affects the result of rule Candidate Class Identification when an entity is discarded as a candidate class. By substituting n = 2 in formula (7), we obtain $\varepsilon_0 = 0.186$.

Now we can calculate the global root mean square value of the quantization error which affects the result of a rule by applying formula (4):

$$\varepsilon = \sqrt{\varepsilon_1^2 \cdot p(Output = 1) + \varepsilon_0^2 \cdot p(Output = 0)} = \sqrt{\frac{1 + \left(\tfrac{7}{12}\right)^n - 2 \cdot \left(\tfrac{3}{4}\right)^n}{2^n} + \frac{1 - \left(\tfrac{7}{8}\right)^n}{3^n}} \qquad (8)$$

For instance, the root mean square value of the quantization error which affects the result of rule Candidate Class Identification is $\varepsilon = 0.283$.

The coupling of rules with multiple conditions also affects the quantization error. Consider, for example, rules Rule1 and Rule2 whose antecedents are composed of n and m conditions, respectively. Suppose that Rule1 is coupled to Rule2, i.e., the consequent of Rule1 matches a condition of Rule2. This is like having the antecedent of rule Rule2 composed of n+m-1 conditions.

2.4 Evaluation of the Accuracy of the Formulas

In this section we evaluate the accuracy of the quantization error formulas from the following five perspectives: effect of quantization errors, validity of the formulas, interpretation of the formulas, effect of the quantization policy and possible use of metrics.

Effect of quantization errors

The formulas presented in the previous section determine the root mean square value of the quantization error. However, they do not indicate that the resulting object-oriented model will have the same percentage of error. The measurement of error in the resulting object model requires a detailed semantic analysis of the requirement specification and the object model². It is, however, expected that the loss of information will eventually cause errors in the resulting object-oriented model. In section 2.5, a possible effect of the quantization error will be illustrated by using a simple example.

Validity of the formulas

Formulas (4) and (5) are based on some general assumptions and therefore they are applicable to a large category of rules. The validity of formulas (6), (7) and (8) depends on the assumptions made about the relations between the input value and the truth value of the conditions, and the output value and the truth value of the consequent.
In addition, the assumption that the input values are uniformly distributed also affects these formulas. Obviously, other distribution functions could be used rather than the uniform distribution. Further, since the formulas are based on statistical assumptions, their validity increases over a long run and for applications with a large number of entities. We believe that the formulation of the quantization error is particularly useful in analyzing, comparing and relatively improving methods. For this purpose, we consider these assumptions acceptable.

² This issue is not the aim of this paper.

Interpretation of the formulas

If the coupling between rules is through the accepted subsets, the root mean square value of the quantization error can be computed by formula (6). This formula shows that the error value increases with the number of couplings between rules whose antecedents are composed of two or more conditions, and approaches 1 when this number approaches ∞. If the coupling always occurs through the rejected subsets, the error value can be computed by formula (7). This formula shows that the error value decreases with the number of couplings between rules whose antecedents are composed of two or more conditions, and approaches 0 when this number approaches ∞.

In most methods, design models are created by classifying artifacts into accepted sets. In this case, the quantization error will increase with the number of couplings. If all the rules reject the entities, then the error will be less, but there will be no object model in the end. Therefore, it may not be meaningful to define object-oriented methods solely on the basis of quantization error calculations. The objectives of the design method must be considered as the main goal. However, design methods can be improved by reorganizing design rules. For example, it may be better to defer the elimination of an artifact until all the relevant information is collected. In this way, the elimination is affected by a lower quantization error than an earlier elimination.
Another observation is that classifying an entity into a rejected set does not only mean classifying that entity to level 0; generally it also means discarding that entity. As a result, the discarded entity cannot be considered anymore by the subsequent rules.

Effect of quantization policy

In the quantization process of a sampled signal, if a value lies between two quantization levels, a quantization policy must be adopted to classify the signal into one of the adjacent levels. If the difference between these levels is too large, then the policy becomes more crucial for the quality of the quantization process. Since two-valued logic based object-oriented methods have two quantization levels, the quantization policy should be as accurate and precise as possible.

The quantization policy of a rule is determined by the conditions of the rule, which establish a threshold value to quantize object-oriented artifacts into accepted or rejected sets. It is not always easy to define an ideal threshold value, especially in the early phases of software development. In addition, during the interpretation of conditions, the software engineer may be requested to provide some information which is already quantized. For instance, in rule Candidate Class Identification, the condition IF AN ENTITY IN A REQUIREMENT SPECIFICATION IS RELEVANT requires the software engineer to quantize the relevance of an entity as relevant or not relevant. The relevance of an entity depends on the intuition and experience of the software engineer. Since both the threshold value and the response of the software
engineer may be very subjective, it is a difficult task to define a very accurate and precise quantization policy. In practice, the quantization policy can be improved by defining fine-grained rules dedicated to the targeted application domain.

Possible use of metrics

Using metrics in the conditions of rules may eliminate the need to have input quantized by the software engineer. Most object-oriented metrics aim at measuring the artifacts of an object-oriented development process in an objective way. For example, in [6] and [8], a number of metrics are defined to quantify a set of design artifacts such as operation inheritance and coupling. In general, the metrics of an artifact are computed by using a formula and result in numbers. A rule then must establish a threshold value to determine whether the measured artifact is acceptable or not. The quality of the metrics and of the threshold value determines the quality of a quantization policy. A satisfactory metrics-based quantization policy may be too difficult or even impossible to determine. It is very hard to formulate accurate and precise metrics, especially for the early phases of software development. Moreover, even if a satisfactory metrics-based quantization policy exists, the result of a rule remains a two-level quantization.

2.5 An Example for Illustrating the Quantization Error Problem

To illustrate the quantization error problem in two-valued logic based object-oriented methods, we will first present a simple object-oriented method and then apply it to an example.

2.5.1 Description of a Simple Object-Oriented Method

Our method consists of the following rules:

R(1) Candidate Class Identification: IF AN ENTITY IN A REQUIREMENT SPECIFICATION IS RELEVANT AND CAN EXIST AUTONOMOUSLY IN THE APPLICATION DOMAIN, THEN SELECT IT AS A CANDIDATE CLASS.

R(2) Redundant Class Elimination: IF TWO CANDIDATE CLASSES EXPRESS THE SAME INFORMATION, THEN DISCARD THE LEAST DESCRIPTIVE ONE.

R(3) Attribute Identification: IF AN ENTITY IN A REQUIREMENT SPECIFICATION IS RELEVANT AND CANNOT EXIST AUTONOMOUSLY IN THE APPLICATION DOMAIN, THEN IDENTIFY IT AS AN ATTRIBUTE.

R(4) Class to Attribute Conversion: IF A CANDIDATE CLASS QUALIFIES ANOTHER CLASS, THEN IDENTIFY IT AS AN ATTRIBUTE OF THAT CLASS.

R(5) Aggregation Identification: IF CLASS A CONTAINS CLASS B, THEN CLASS A AGGREGATES CLASS B.

R(6) Inheritance Identification: IF CLASS A IS A KIND OF CLASS B, THEN CLASS A INHERITS FROM CLASS B.

R(7) Inheritance Modification: IN THE CLASS HIERARCHY, IF THE NUMBER OF IMMEDIATE SUBCLASSES SUBORDINATED TO A CLASS IS LARGER THAN 5, THEN THE INHERITANCE HIERARCHY IS COMPLEX.
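Rule R(7) is the only metrics-based rule in the list; as a sketch (representing the hierarchy as a mapping from a class to its immediate subclasses is our assumption), its two-valued threshold check might look like:

```python
# Illustrative sketch of rule R(7) as a metrics-based, two-valued check.
# The {class: [immediate subclasses]} representation is an assumption for this example.

def hierarchy_is_complex(subclasses: dict, threshold: int = 5) -> bool:
    """R(7): the hierarchy is complex if any class has more than
    `threshold` immediate subclasses."""
    return any(len(subs) > threshold for subs in subclasses.values())

hierarchy = {"Graphic-Element": ["Point", "Line", "Rectangle", "Circle"]}
print(hierarchy_is_complex(hierarchy))   # False: only four immediate subclasses
```

Note that even here the graded metric (the subclass count) is collapsed to a single Boolean, which is exactly the two-level quantization discussed in section 2.4.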

The dependencies between these rules are shown in Figure 3.

Fig. 3. The dependencies between the rules of the example analysis method.

This method takes the requirement specification as input and produces classes, attributes, and inheritance and aggregation relations as output. The method has to evaluate various rules before generating a model. For example, to identify an entity in a requirement specification as a class, the corresponding rules must be evaluated in the following order. First, rule Candidate Class Identification must accept the entity. Second, rules Redundant Class Elimination and Class to Attribute Conversion must reject the entity. To consider an entity as an attribute, rule Attribute Identification must accept the entity. An attribute can also be identified by applying rule Class to Attribute Conversion, which transforms candidate classes into attributes. To identify aggregation and inheritance relations, the entities involved in these relations must first be accepted as classes. Then, possible contains and is-a-kind-of relations between these classes have to be converted to aggregation and inheritance relations, respectively. Rules R(1) to R(6) are adopted by most object-oriented methods. Rule R(7) is taken from [8]. If this rule concludes that the inheritance hierarchy is complex, then the hierarchy may be modified.

2.5.2 Application of the Method

Our example problem is described in the following:

A graphics application provides tools for drawing a set of graphic elements such as points, lines, rectangles, circles, and squares. A point is defined by its coordinates. A line has two reference points. A rectangle can be defined by a reference point and a diagonal line. A circle can be characterized by its center and radius. A square can be defined by a reference point and a diagonal line. Each element has a color.

For brevity, we will not describe the detailed properties of all the graphical elements. After inspecting the requirement specification and using noun extraction, the following entities are provided to rule Candidate Class Identification: Graphics-Application, Tool, Graphic-Element, Point, Line, Rectangle, Circle, Square, Coordinate, Reference-Point, Diagonal-Line, Center, Radius and Color.

Rule Candidate Class Identification rejects entities Graphics-Application and Tool because they are not considered relevant for the application. Entity Color is rejected because Color qualifies other graphical objects and therefore is not considered an autonomously existing entity. All other entities are selected as candidate classes. The rejected entities are evaluated by rule Attribute Identification.
This rule accepts Color as an attribute because Color qualifies the graphic elements, but it rejects Graphics-Application and Tool. The following groups of candidate classes express similar information: (Square, Rectangle), (Line, Diagonal-Line, Radius), (Point, Reference-Point, Center). Rule Redundant Class Elimination eliminates Square, Diagonal-Line, Radius, Reference-Point and Center because they are considered less expressive than their equivalent candidate classes. Candidate classes Graphic-Element, Circle and Coordinate are not eliminated because there are no other candidate classes which express similar information. Rule Class to Attribute Conversion converts candidate class Coordinate to an attribute because Coordinate qualifies Point. Further, this rule selects Graphic-Element, Point, Line, Rectangle and Circle as classes. After the application of rule Aggregation Identification, the following aggregation relations are identified: Line, Rectangle and Circle aggregate Point; Rectangle and Circle aggregate Line.

Rule Inheritance Identification identifies a candidate inheritance relation between Graphic-Element, and Point, Line, Rectangle and Circle. Rule Inheritance Modification does not affect the inheritance hierarchy. The object diagram of our graphics application is shown in Figure 4 using the OMT notation [21].

Fig. 4. The object diagram of the graphics application using the OMT notation.

2.5.3 Possible Quantization Error Problems in the Example

If the software engineer realizes that the resulting object model is not satisfactory, then there are two possible options: improving the model by applying subsequent rules and/or by iterating the process. The application of subsequent rules may not adequately improve the model because of the loss of information due to quantization errors. The iteration of the process still suffers from the quantization error problem. Moreover, managing an iteration remains a difficult task.

In the following, we evaluate the object model of our graphics application. We focus on the desired changes necessary to improve the model. In particular, two kinds of changes are perceived: reincarnation of eliminated artifacts and conversion of artifacts.

Reincarnation of eliminated artifacts

Some artifacts which were discarded during the analysis process may turn out to be relevant artifacts in the later phases. Suppose that later in the design process we realize that an explicit representation of square shapes may sometimes be necessary. For instance, in pattern recognition systems or in tutoring systems used for teaching geometry, it may be desirable to represent square shapes as a class. However, candidate class Square was considered redundant by the quantization process carried out by rule Redundant Class Elimination. Iteration of the process seems to be the only means to identify and reconsider class Square³.

³ One may claim that an experienced designer should not eliminate Square from the class repository. The purpose of this example is to illustrate the possibility of eliminating artifacts of the object-oriented development which could be considered relevant in the later phases. Since not all the entities in a realistic requirement specification can be

Conversion of artifacts Application of an object-oriented method classifies entities in a requirement specification into object-oriented concepts such as classes, attributes, operations. During the development process, the software engineer may discover that an entity could have been better classified into a different concept than the current concept. This requires conversion of entities from one concept to another. Assume that the operation to display a graphic element is based on a set of sophisticated color processing operations. In our object model Color is classified as an attribute. A practical implementation of this attribute will probably be an instance of class String. However, color processing operations demand a more complex object structure. Therefore, it would be quite reasonable to define Color as a class. To do this, the software engineer is forced to convert Color from an attribute to a class and iterate the process starting from rule Candidate Class Identification. Similar problems can occur with relations. Consider, for example, the aggregation relation between classes Rectangle, and Point and Line. By inspecting the interface operations of these classes, one can deduce that Rectangle shares some of its operations with Point and Line. To improve reusability, instead of using an aggregation relation, we could have defined an inheritance relation between Rectangle, and Point and Line4. In case of inheritance, Point and Line correspond to the reference point and the diagonal line of Rectangle, respectively. In case of aggregation, however, Rectangle has to explicitly declare all the operations of Point and Line, since inheritance provides transitive reuse of the inherited operations whereas aggregated parts are normally encapsulated within the aggregating class. Similar conversions can be applied to other aggregation relations. For example, class Line might inherit from class Point, and class Circle might inherit from classes Point and Line. 
During the analysis process, the inheritance relation between class Rectangle and classes Point and Line was discarded by the quantization process implemented by rule Inheritance Identification. Again, iteration of the analysis phase is necessary to convert the aggregation relation into inheritance.

(or should be) included in the object model, unwanted elimination of artifacts in two-valued logic based methods is, in principle, inevitable; a decision for elimination or acceptance is regularly an intuitive decision. 4

One may claim that conceptually Rectangle must not inherit from Point because Point is-a-part-of Rectangle and Rectangle is-not-a-kind-of Point. While we agree on this statement from a conceptual modeling point of view, such an inheritance relation can be considered valid from a reuse point of view: If a class C1 includes a part p of class C2, and if the operations of C2 are visible at the interface of C1, then semantically C1 inherits from C2. The visibility of the operations of p can be realized in three ways. The first approach is to declare all the operations of C2 at the interface of C1. C1 can then forward the corresponding requests to p of class C2. This approach is used, for example in the Bridge Pattern [10]. The second approach is to use a delegation mechanism as adopted by the languages Self [23] and Sina [2]. If the whole object delegates to its part, then semantically the whole object inherits from its part. The third option is to consider is-a-part-of relation as a conceptual specification, and use inheritance relation as an implementation of this is-a-part-of relation. Concluding, conversion of relations in two-valued logic based methods will be likely to occur because sometimes part-of relations may be converted to inheritance relations for reusability purposes. Moreover, the semantics of a relation can be context dependent and/or the exact semantics of a relation may not be completely understood at the time when it is defined.

14

3. The Contextual Bias Problem

Contextual factors may influence the validity of the result of a rule in two ways. Firstly, the input of a rule can be largely context dependent. In rule Redundant Class Elimination, for instance, the elimination of a class is based on the perception of the software engineer, namely whether he or she finds a candidate class more descriptive than an equivalent class. Secondly, the validity of a rule may largely depend on contextual factors such as the application domain, changes in the user's interest, and technological advances. Assume that in our graphics application new graphic elements are introduced, such as Ellipse, Triangle and Trapezoid. Each graphic element inherits from class Point. However, rule Inheritance Modification advises against this solution because a superclass must not have more than five subclasses. The problem is that the success of this rule heavily depends on the type of application. For example, in graphics applications it appears natural that many classes inherit directly from class Point, because class Point represents a very basic abstraction in a graphic processing system.

Using metrics based rules may not eliminate the effects of context either. As some authors indicate [4], metrics must be associated with some interpretation to determine the threshold of a design rule. But this interpretation must be given in a context. Only when the variables which can influence the measure are fixed does the interpretation of the metrics become univocal. Otherwise, the result is either an improper interpretation or a large number of possible interpretations. As an example, consider Chidamber and Kemerer's [8] interpretations of the experimental results obtained by collecting their metrics in two different software organizations. They observe that contingency factors are responsible for some marked differences between the values obtained from the two samples, but they do not quantify how much and in which way these factors influence the metrics. We term the effect of context on the development process the contextual bias problem.

4. Requirements for Reducing Quantization Error and Contextual Bias Problems

In section 2.4, it was stated that the quantization error could be reduced by defining fine-grained dedicated rules for the targeted application domain. In addition, in this section we propose the following four requirements for reducing the quantization error and contextual bias problems:

• Increase the number of quantization levels: As shown in section 2, the quantization error can be lowered by increasing the number of quantization levels. Consider, for example, rule R(1) of section 2.5.1: IF AN ENTITY IN A REQUIREMENT SPECIFICATION IS RELEVANT AND CAN EXIST AUTONOMOUSLY IN THE APPLICATION DOMAIN, THEN SELECT IT AS A CANDIDATE CLASS.

In two-valued logic based methods, the relevance of an entity can be expressed only as relevant or not relevant. To increase the number of quantization levels, we have to split the range of relevance into more levels, such as weakly, slightly, fairly, substantially and strongly relevant. Having more levels decreases the quantization error and makes the method less sensitive to the quantization policy.

• Avoid early elimination of artifacts: As shown in section 2.5.3, too early elimination of artifacts may cause undesired effects on the resulting object model. Each decision taken by a rule is based on the information available up to that phase. In the early phases, there may not be sufficient information available to take an abrupt decision like discarding an entity. Such a decision must be taken only if there is sufficient evidence that the entity is indeed irrelevant. In most object-oriented methods, however, each identification process is followed by an elimination process. For example, the OMT method [21] proposes a process that includes class identification and elimination, association identification and elimination, aggregation identification and elimination, and so on. Now, assume that a software engineer discards an entity because it is considered non-relevant. The discarded entity, however, could have been included as a candidate class if the software engineer had gathered more information about its structure and operations. During the later phases this is practically impossible because the discarded entity cannot be considered any further. Early elimination of artifacts in two-valued logic based methods is practically inevitable.

• Explicit modeling of the influence of the context: To reduce the contextual bias problem, a method must consider the influence of context. Only when the context variables which can influence the result of a rule are fixed does the application of that rule produce reliable results.

• Quality versus cost: Designing a signal processing system basically deals with finding a trade-off between cost and precision. This enables designers to tune the system architecture with respect to the cost-quality requirements. For example, in practice 128 levels are used for telephone speech transmission, whereas 65,536 levels are used for CD digital audio systems. In current object-oriented methods, however, the relation between the cost and quality of a method is not clear.
Therefore, unlike in digital signal processing systems, there is no leverage for the software engineer to obtain more quality at the expense of higher costs. If the number of quantization levels is increased, then the cost of having multiple levels with respect to the quality of the method must be estimable.

5. Using Fuzzy-Logic in Object-Oriented Methods

This section explains the use of fuzzy-logic in object-oriented methods. Section 5.1 defines a model to represent multiple quantization levels. Section 5.2 describes the use of fuzzy-logic based rules in object-oriented methods. A formulation of the quantization error in fuzzy-logic based methods is given in section 5.3. Section 5.4 illustrates the applicability of a simple fuzzy-logic based method by using an example. Section 5.5 explains how fuzzy-logic based techniques can be used to model the effect of context on rules.

5.1 Multiple Quantization Levels

Assume that each concept (artifact type) is defined as [C, (P1, D1), (P2, D2), ..., (Pn, Dn)], where C is the concept name, Pi is a property of C and Di is the definition domain of Pi. An example of a concept is [Entity, (Relevance, {True, False}), (Autonomy, {True, False})]. Here, True and False are the two values that Relevance and Autonomy can assume. A software artifact is an instantiation of its concept and can be expressed as [C, id, (P1: V1), (P2: V2), ..., (Pn: Vn)], where C is the concept of the artifact, id is the unique identifier of the artifact, and Vi is a value defined in domain Di of property Pi. Artifacts can also be named. In the following example, Rectangle is the name of the artifact:

Rectangle ← [Entity, id, (Relevance: True), (Autonomy: True)]
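The artifact notation above can be mirrored in a small data structure. The following Python sketch is illustrative only; the class name Artifact and its field names are our assumptions, not part of the method:

```python
# Sketch of the artifact representation [C, id, (P1: V1), ..., (Pn: Vn)].
# The names Artifact, concept and properties are illustrative assumptions.
from dataclasses import dataclass, field
from itertools import count

_next_id = count(1)  # generates the unique artifact identifiers

@dataclass
class Artifact:
    concept: str          # C, the concept name, e.g. "Entity"
    properties: dict      # {Pi: Vi}, property values drawn from the domains Di
    id: int = field(default_factory=lambda: next(_next_id))

# Rectangle ← [Entity, id, (Relevance: True), (Autonomy: True)]
rectangle = Artifact("Entity", {"Relevance": True, "Autonomy": True})
print(rectangle.concept, rectangle.properties)
```

Named artifacts then simply become Python variables bound to Artifact instances.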

The number of quantization levels can be increased by introducing more values than True and False. For example, the software engineer may select Relevance and Autonomy values from a number of alternatives, as illustrated by the following example: [Entity, (Relevance, {Weakly, Slightly, Fairly, Substantially, Strongly}), (Autonomy, {Dependently, Partially Dependently, Fully Autonomously})]

To represent multiple levels conveniently, the rules defined in section 2.5.1 have to be modified. Consider, for example, the modified rule Candidate Class Identification: IF AN ENTITY IN A REQUIREMENT SPECIFICATION IS RELEVANCE VALUE RELEVANT AND CAN EXIST AUTONOMY VALUE AUTONOMOUS IN THE APPLICATION DOMAIN, THEN SELECT IT AS A RELEVANCE VALUE RELEVANT CANDIDATE CLASS.

Here, an entity and a candidate class are the concepts to be reasoned, Relevance and Autonomy are the properties, and relevance value and autonomy value indicate the domains of these properties. Now, assume that relevance value represents the set of values {Weakly, Slightly, Fairly, Substantially, Strongly}, and autonomy value represents the set of values {Dependently, Partially Dependently, Fully Autonomously}. Using these values, rule Candidate Class Identification can be represented in the following way: P ← [Entity, id1, (Relevance: V1 ∈ {Weakly, Slightly, Fairly, Substantially, Strongly}), (Autonomy: V2 ∈ {Dependently, Partially Dependently, Fully Autonomously})] ⇒ P ← [CandidateClass, id2, (Relevance: V3 ∈ {Weakly, Slightly, Fairly, Substantially, Strongly})]

Here, P and symbol ⇒ indicate a generic artifact name and the implication operator, respectively. Each combination of relevance and autonomy values of an entity has to be mapped into one of the five candidate class relevance values. This requires in total 3 × 5 = 15 rules. We call these sub-rules. The following is an example of a sub-rule: P ← [Entity, id1, (Relevance: Strongly), (Autonomy: Fully Autonomously)] ⇒ P ← [CandidateClass, id2, (Relevance: Strongly)]
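The 3 × 5 = 15 antecedent combinations behind these sub-rules can be enumerated mechanically; a brief Python sketch (the list names are ours):

```python
# Enumerate the 15 sub-rules of Candidate Class Identification as the
# cross product of the two input domains (list names are illustrative).
from itertools import product

relevance = ["Weakly", "Slightly", "Fairly", "Substantially", "Strongly"]
autonomy = ["Dependently", "Partially Dependently", "Fully Autonomously"]

sub_rules = list(product(relevance, autonomy))
print(len(sub_rules))  # 15 antecedent combinations, one sub-rule each
```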

5.2 Fuzzy-Logic Based Rules

In two-valued logic, each property value corresponds to a crisp (disjoint) set, resulting in an abrupt separation of elements between different sets. For example, the relevance values Weakly, Slightly, Fairly, Substantially and Strongly of an entity in a requirement specification correspond to crisp sets. One of the major problems of two-valued logic is that it does not provide an effective means for capturing the approximate, inexact nature of the software development process. Consider, for example, rule Inheritance Modification in section 2.5.1:

P1 ← [Class, id1, (ImmediateSubclasses: V1 ∈ {Low, Medium, High})] ⇒ P2 ← [Inheritance, id2, (Complexity: V2 ∈ {Low, Medium, High})]

This rule has to be decomposed into 3 sub-rules:


P1 ← [Class, id11, (ImmediateSubclasses: High)] ⇒ P2 ← [Inheritance, id21, (Complexity: High)]
P1 ← [Class, id12, (ImmediateSubclasses: Medium)] ⇒ P2 ← [Inheritance, id22, (Complexity: Medium)]
P1 ← [Class, id13, (ImmediateSubclasses: Low)] ⇒ P2 ← [Inheritance, id23, (Complexity: Low)]

It is practically impossible to define crisp boundaries determining when the number of immediate subclasses is Low, Medium or High. If, for example, we assert that the number of subclasses is High if it is larger than 10, then we must admit that if the number of immediate subclasses is 9, the complexity of the inheritance hierarchy is not High. But a difference of 1 does not look like a distinguishing characteristic. The transition between membership and non-membership appears gradual rather than abrupt. On the other hand, in two-valued logic defining this crisp boundary is necessary because a proposition can only be true or false. Hence, an inheritance hierarchy must be characterized by either Low, Medium or High complexity. It is therefore worth investigating other forms of logic than two-valued logic.

In fuzzy logic, the concept of vagueness is introduced by the definition of fuzzy sets. A fuzzy set S of a universe of discourse U is characterized by a membership function µS: U → [0, 1] which associates with each element y of U a number µS(y) in the interval [0, 1] representing the grade of membership of y in S [25]. Based on the definition of fuzzy sets, the concept of linguistic variables is introduced to represent a language typically adopted by a human expert. A linguistic variable is a variable whose values, called linguistic values, have the form of phrases or sentences in a natural language [25]. For instance, property Relevance of an entity in a requirement specification can be modeled as a linguistic variable which might assume the linguistic values Weakly, Slightly, Fairly, Substantially and Strongly. Each linguistic value is associated with a fuzzy set that represents its meaning. Figure 5 shows a possible definition of these linguistic values. Here, the X and Y axes indicate the entities and their relevance values, respectively. In this figure, each linguistic value is shown as a different line type. Notice that, in contrast to crisp sets, linguistic values are defined as partially overlapping membership functions.
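Overlapping membership functions of this kind can be sketched, for instance, as triangular fuzzy sets. The breakpoints below (peaks at 0, 0.25, 0.5, 0.75 and 1, crossings at height 0.5) follow Figure 5; the exact triangular shape is an assumption:

```python
# Triangular membership functions approximating Figure 5; the shapes are an
# assumption, but the peaks and 0.5-height crossing points match the text.
def triangular(a, b, c):
    """Membership function with support [a, c] and peak at b;
    a == b or b == c yields a left- or right-shouldered set."""
    def mu(y):
        left = 1.0 if b == a else (y - a) / (b - a)
        right = 1.0 if c == b else (c - y) / (c - b)
        return max(0.0, min(left, right, 1.0))
    return mu

relevance_sets = {
    "Weakly":        triangular(0.00, 0.00, 0.25),
    "Slightly":      triangular(0.00, 0.25, 0.50),
    "Fairly":        triangular(0.25, 0.50, 0.75),
    "Substantially": triangular(0.50, 0.75, 1.00),
    "Strongly":      triangular(0.75, 1.00, 1.00),
}

# Overlap: a relevance of 0.3 is partly Slightly and partly Fairly.
print({n: round(mu(0.3), 2) for n, mu in relevance_sets.items()})
```

At a crossing point such as 0.375, two adjacent sets both yield a grade of 0.5, which is exactly the gradual transition that crisp sets cannot express.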


[Figure 5 plots five partially overlapping membership functions over the relevance axis, crossing at heights 0.125, 0.375, 0.625 and 0.875, with peaks at A0 = 0, A1 = 0.25, A2 = 0.5, A3 = 0.75 and A4 = 1.]

Fig. 5. Five linguistic values defined by membership functions. Each membership function is shown as a different line type. For example, Weakly is drawn as a dashed line, Fairly as a solid line, etc.

Figure 6 shows the definition of the linguistic values Dependently, Partially Dependently and Fully Autonomously of property Autonomy. The X and Y axes indicate the entities and the Autonomy value, respectively. Similar to Figure 5, each membership function is drawn as a different line type.

[Figure 6 plots three partially overlapping membership functions over the autonomy axis, with peaks at 0, 0.50 and 1.]

Fig. 6. Membership functions of Dependently, Partially Dependently and Fully Autonomously.

Consider the following rule expressed in fuzzy logic: IF X IS A THEN Y IS B, where X and Y are linguistic variables and A and B are linguistic values. The evaluation of this rule may result in intermediate values between 0 and 1 rather than Boolean values 0 or 1. Here, the implication of this rule is a fuzzy relation between the fuzzy sets corresponding to the propositions X IS A and Y IS B, rather than a connective defined by a truth table. If a crisp value is required as a result of a rule, the corresponding fuzzy set has to be defuzzified by using a defuzzification operation. Consider, for example, fuzzy rule Candidate Class Identification: P ← [Entity, id1, (Relevance: V1 ∈ {Weakly, Slightly, Fairly, Substantially, Strongly}) (Autonomy: V2 ∈ {Dependently, Partially Dependently, Fully Autonomously})] ⇒f P ←[CandidateClass, id2, (Relevance: V3 ∈ {Weakly, Slightly, Fairly, Substantially, Strongly})]
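One common reading of IF X IS A THEN Y IS B is the Mamdani-style min implication, sketched below. This is only an illustration of how such a rule yields intermediate truth values; the actual implication and defuzzification operators used by the method are the ones defined in the appendix:

```python
# Mamdani-style reading of IF X IS A THEN Y IS B (an illustrative choice,
# not necessarily the operators adopted in the paper's appendix).
def fire_rule(mu_a_of_x, mu_b):
    """Clip the output fuzzy set B by the degree to which 'X IS A' holds."""
    return lambda y: min(mu_a_of_x, mu_b(y))

# Example output set B peaking at 1.0 (the shape is an assumption).
mu_b = lambda y: max(0.0, 1.0 - abs(1.0 - y) / 0.25)
clipped = fire_rule(0.6, mu_b)   # the antecedent holds to degree 0.6
print(clipped(1.0))  # 0.6, an intermediate value between 0 and 1
```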

The definitions of the linguistic values Weakly, Slightly, Fairly, Substantially and Strongly are shown in Figure 5. The definitions of the linguistic values Dependently, Partially Dependently and Fully Autonomously are given in Figure 6. The symbol ⇒f indicates a fuzzy-implication operator. This rule is composed of 15 fuzzy sub-rules, which are shown in Table 1.


                           P ← Entity, Relevance:
P ← Entity, Autonomy:      Weakly    Slightly    Fairly      Substantially    Strongly
Dependently                Weakly    Weakly      Weakly      Slightly         Weakly
Partially Dependently      Weakly    Slightly    Slightly    Fairly           Fairly
Fully Autonomously         Weakly    Slightly    Fairly      Substantially    Strongly

Each entry gives the resulting value of P ← CandidateClass, Relevance.

Table 1. Relation between the input variables and the result of rule Candidate Class Identification.
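The 15 sub-rules of Table 1 amount to a lookup table; a sketch (the names are ours, the entries are transcribed from the table):

```python
# Table 1 as a lookup: (autonomy, relevance) of the entity gives the
# relevance of the candidate class. Names are illustrative; the entries
# are transcribed from the table above.
RELEVANCE = ["Weakly", "Slightly", "Fairly", "Substantially", "Strongly"]

TABLE_1 = {
    "Dependently":           ["Weakly", "Weakly", "Weakly", "Slightly", "Weakly"],
    "Partially Dependently": ["Weakly", "Slightly", "Slightly", "Fairly", "Fairly"],
    "Fully Autonomously":    ["Weakly", "Slightly", "Fairly", "Substantially", "Strongly"],
}

def candidate_class_relevance(autonomy, relevance):
    return TABLE_1[autonomy][RELEVANCE.index(relevance)]

print(candidate_class_relevance("Fully Autonomously", "Strongly"))  # Strongly
```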

Here, columns and rows represent the input values of the properties Relevance and Autonomy, respectively. Each element of the table represents the output value of a sub-rule, which is the relevance value of the candidate class being considered. For example, if the relevance and autonomy values are respectively Strongly and Fully Autonomously, then the candidate class relevance value is Strongly. We selected these output values based on our intuition and knowledge of object-oriented methods. We discuss the effect of the output values in section 5.3, where we formulate the quantization errors in fuzzy-logic based rules. The appendix gives a more detailed explanation of fuzzy-logic based reasoning.

5.3 The Quantization Error in Fuzzy-Logic Based Methods

5.3.1 Quantization Error in Case of Linguistic Values as Input

Consider the membership functions defined in Figure 5. Here, 0.125, 0.375, 0.625 and 0.875 correspond to the Y-axis values of the crossing points of the membership functions. If a linguistic value such as Slightly is selected, the actual value can be at any point defined by the membership function Slightly. To simplify the formulation of the quantization error, however, we make the following assumption: the software engineer's perception is likely to be restricted to the values between the crossing points. For example, if the software engineer assumed that the actual value was less than 0.125, he or she would probably have used Weakly instead of Slightly. Therefore, we assume that the software engineer mentally limits the fuzzy set to the range of values at which the associated membership function takes higher values than the other membership functions. This means, for example, that Slightly is considered to be between 0.125 and 0.375. We would like to stress that this assumption is only made to formulate the quantization error; during the application of the fuzzy rules, no such restriction is applied.
To formulate the quantization error for more than 2 quantization levels, we have to determine the threshold values and the amplitudes of the quantization levels. We assume that the crossing points are the threshold values. The amplitude of a quantization level can be considered as the defuzzified value of the corresponding fuzzy set. Referring to Figure 5, if the peak values of the membership functions are used as the defuzzified values, then the quantization levels are 0, 0.25, 0.5, 0.75 and 1. Assuming that, over a long period of time, all possible X values appear the same number of times, by using formula (3) of section 2.2 with A = 1 and N = 5, the mean value of the quantization error is computed as 0.072.
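Under the stated assumptions (thresholds at the crossing points, amplitudes at the peaks, uniformly distributed inputs), this value can be checked numerically. The sketch below computes the error as a root mean square, which is what the 0.072 figure corresponds to:

```python
# Numerical check of the 5-level quantization error (RMS) under the stated
# assumptions: levels at the membership-function peaks, thresholds at the
# crossing points, uniformly distributed input values on [0, 1].
levels = [0.0, 0.25, 0.5, 0.75, 1.0]
n = 200000
total = 0.0
for i in range(n):
    x = (i + 0.5) / n                          # uniform samples on [0, 1]
    q = min(levels, key=lambda a: abs(x - a))  # nearest-level quantization
    total += (x - q) ** 2
rms = (total / n) ** 0.5
print(round(rms, 3))  # ≈ 0.072, matching the value from formula (3)
```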


Now, we calculate the quantization error which affects the result of a fuzzy rule. In case of two-valued logic, it is possible to define formulas such as (6), (7) and (8) for rules with n conditions. In fact, the truth values of all the conditions have two levels, one at 0 and the other at 1, and a threshold can be assumed to be at 0.5. Further, each combination of the truth values of the conditions which generates the result 0 or 1 is predefined. In case of fuzzy logic, however, the truth values of conditions can have a different number of levels, or levels positioned at different amplitudes. Further, each level in the result of a fuzzy rule can be generated by many different combinations that are not predefined. For this reason, although formulas (4) and (5) continue to be valid, it is not possible to define formulas like (6), (7) and (8) for fuzzy rules with n conditions. Nevertheless, the quantization error of a fuzzy rule can be computed if the definition of that rule is precisely known.

Taking these considerations into account, to compare two-valued logic and fuzzy-logic based rules, we compute the root mean square value of the quantization error only for fuzzy rule Candidate Class Identification as defined by Table 1. Firstly, we compute the quantization error for each output level. Then, by using formula (4), we calculate the global root mean square error. Finally, we compare this result with the quantization error of the two-valued logic based rule. In general, each output level can be the result of applying more than a single rule. The root mean square value of the quantization error in case of applying R rules can be computed by using the formula

    ε_j = √( ∑_{i=1}^{R} p(r_{j,i}) ⋅ ε_{j,i}² / ∑_{i=1}^{R} p(r_{j,i}) )        (9)

where p(rj,i) is the probability to apply rule rj,i and ε j ,i is the root mean square value of the quantization error generated by quantizing the result of rule i to the level j. The probability p(rj,i) can be computed by using the following formula: p(rj ,i ) =

C j ,i

∏  ∫

c= 1

Uc

Lc

pd (cc )dc

(10)

where pd(c_c) is the probability density of the truth value c_c of condition c, and C_{j,i} is the number of conditions of rule r_{j,i}. The value ε_{j,i} can be computed by formula (5), where A_j is the value of the level j of the result of the rule. Let us assume the same hypotheses as in the case of two-valued logic: i) the random variables which represent the truth values of the conditions are independent, ii) the truth values of conditions and consequent increase linearly with the increase of input values and result, respectively, and iii) over a long period of time all possible truth values of the conditions appear the same number of times. As an example, we compute ε_{j,i} and p(r_{j,i}) for the following sub-rule:


P ← [Entity, id1, (Relevance: Strongly), (Autonomy: Fully Autonomously)] ⇒f P ← [CandidateClass, id2, (Relevance: Strongly)]

The quantization levels of the result are defined in the same way as the ones of Figure 5. This means that the linguistic value Strongly corresponds to quantizing the result to 1. We apply formula (5) with D_j = [0.875, 1] × [0.75, 1], A_j = 1, pd(c₁ | Output = 1) = 1/(1 − 0.875) and pd(c₂ | Output = 1) = 1/(1 − 0.75). We obtain ε_{4,i} = 0.195.
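Formulas (9) and (10) can be sketched directly in code. Under the uniform-density assumption, the integral in (10) reduces to the width of each condition interval; the function names below are ours:

```python
# Sketches of formulas (9) and (10); function names are illustrative.
import math

def rule_probability(intervals):
    """Formula (10) with uniform truth-value densities on [0, 1]: the
    probability of applying a rule is the product of its condition-interval
    widths."""
    p = 1.0
    for lower, upper in intervals:
        p *= (upper - lower)
    return p

def level_rms_error(probs, errors):
    """Formula (9): probability-weighted RMS of the per-rule errors
    quantized to one output level."""
    num = sum(p * e * e for p, e in zip(probs, errors))
    return math.sqrt(num / sum(probs))

# The sub-rule above: Relevance in [0.875, 1], Autonomy in [0.75, 1].
print(round(rule_probability([(0.875, 1.0), (0.75, 1.0)]), 3))  # 0.031
```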

As the probability density of the truth values is uniform in the interval [0, 1], pd(c₁) = pd(c₂) = 1 and p(r_{4,i}) = (1 − 0.875) ⋅ (1 − 0.75) = 0.031. By computing ε_{j,i} and p(r_{j,i}) in the same way for all the sub-rules in Table 1, we obtain the following root mean square values of the quantization error when the result is quantized to the five levels: ε₀ = 0.066, ε₁ = 0.11, ε₂ = 0.147, ε₃ = 0.125 and ε₄ = 0.195, where levels 0, 1, 2, 3 and 4 correspond to Weakly, Slightly, Fairly, Substantially and Strongly. By applying formula (4), we obtain ε = 0.114. In case of two-valued logic, ε was computed as 0.283. This demonstrates that, compared to two-valued logic based rules, fuzzy-logic based rules reduce the loss of information considerably.

5.3.2 Quantization Error in Case of Crisp Values as Input

Instead of linguistic values, the software engineer may provide crisp values as input. A crisp value indicates a single point on the Y axis, which means that the input is determined with certainty. Providing a crisp value, however, may be very difficult, especially in the early phases of the software development process. Even if the software engineer is allowed to state a crisp value such as "the entity in the requirement specification is 0.3 relevant", the reliability of this value is highly questionable. On the other hand, the inputs to some rules can be estimated with acceptable certainty. For example, the software engineer can precisely determine the input value of linguistic variable Number of Immediate Subclasses, which is used in rule Inheritance Modification in section 5.2. In this rule, Low, Medium and High are the linguistic values of property Number of Immediate Subclasses. Having a crisp value as an input makes it possible to determine the grade of membership of the crisp value with respect to the fuzzy set associated with each linguistic value.
Consider, for example, Figure 7 which shows the meanings of the three linguistic values Low, Medium and High. If the software engineer provides 5 as an input value, then the grade of membership with respect to linguistic values Low and Medium is 0.5 and with respect to High is 0. Having the grade of membership is similar to knowing the distance between the original signal and the amplitude of the quantization level. Practically, this means that the quantization error introduced by the rule can be negligible.
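The grades quoted here can be reproduced with triangular sets shaped like Figure 7, with peaks at 0, 10 and 20 subclasses; the triangular shape itself is an assumption:

```python
# Membership grades for a crisp input of 5 immediate subclasses, using
# triangular sets modeled on Figure 7 (peaks at 0, 10 and 20 are assumed).
def grade(y, a, b, c):
    """Triangular membership with support [a, c] and peak at b."""
    left = 1.0 if b == a else (y - a) / (b - a)
    right = 1.0 if c == b else (c - y) / (c - b)
    return max(0.0, min(left, right, 1.0))

subclasses = 5
low    = grade(subclasses, 0, 0, 10)
medium = grade(subclasses, 0, 10, 20)
high   = grade(subclasses, 10, 20, 20)
print(low, medium, high)  # 0.5 0.5 0.0 — as stated in the text
```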


[Figure 7 plots three membership functions over the number of subclasses (0 to 20), with peaks at A0 = 0 (Low), A1 = 10 (Medium) and A2 = 20 (High).]

Fig. 7. Membership functions for property Number of Immediate Subclasses.

5.3.3 Evaluation of the Accuracy of the Error Calculations in Fuzzy-Logic Based Methods

In section 2.4, the accuracy of the quantization error calculations for two-valued logic based rules was analyzed with respect to the effect of quantization errors, the validity of the formulas, the effect of the quantization policy, and the possible use of metrics. All these evaluations are also valid for the fuzzy-logic based method. In addition, the accuracy of the formulas for fuzzy-logic based rules depends on the accuracy of the definitions of the membership functions, the assumptions about the threshold and quantization levels, the definition of the fuzzy connectives AND and OR, and the fuzzy implication operator. Similar to two-valued logic based rules, it may not be meaningful to define fuzzy-logic based rules solely for the purpose of reducing quantization errors. For example, when applying fuzzy rule Candidate Class Identification, in case an entity is strongly relevant and fully autonomous, selecting that entity as a substantially relevant candidate class would have created a lower quantization error than selecting it as a strongly relevant candidate class. However, we think that the objective of a rule must be considered as the main goal; it is more logical to select the output value Strongly when both input values have their highest possible values. Concluding, we consider the formulation of the quantization error particularly useful in analyzing, comparing and relatively improving methods. In section 6 we compare fuzzy-logic and two-valued logic based methods by using an example.

5.4 Application of a Simple Fuzzy-Logic Based Method

In this section, we modify the two-valued logic based method of section 2.5 into a fuzzy-logic based method and apply it to the graphics application.

5.4.1 Description of the Method

R(1) Candidate Class Identification: This rule was defined in section 5.2.

R(2) Redundant Class Elimination: IF CANDIDATE CLASS P1 IS RELEVANCE VALUE RELEVANT AND CANDIDATE CLASS P2 IS RELEVANCE VALUE RELEVANT AND INFORMATION OF P1 IS EQUIVALENCE VALUE EQUIVALENT TO INFORMATION OF P2 AND P1 IS EXPRESSIVENESS VALUE MORE DESCRIPTIVE THAN P2, THEN SELECT P1 AS A RELEVANCE VALUE NON-REDUNDANT CANDIDATE CLASS.

P1 ← [CandidateClass, id1, (Relevance: V1 ∈ {Weakly, Slightly, Fairly, Substantially, Strongly})] ∧f
P2 ← [CandidateClass, id2, (Relevance: V2 ∈ {Weakly, Slightly, Fairly, Substantially, Strongly})] ∧f
P3 ← [InformationEquivalence, id3, (Between: id1 ∈ CandidateClass), (And: id2 ∈ CandidateClass), (Relevance: V3 ∈ {Weakly, Slightly, Fairly, Substantially, Strongly})] ∧f
P4 ← [Descriptive, id4, (More: id1 ∈ CandidateClass), (Than: id2 ∈ CandidateClass), (Relevance: V4 ∈ {Weakly, Slightly, Fairly, Substantially, Strongly})] ⇒f
P1 ← [Non-RedundantCandidateClass, id5, (Relevance: V5 ∈ {Weakly, Slightly, Fairly, Substantially, Strongly})]

Here, symbol ∧f represents a fuzzy AND connective. This rule has to be decomposed into sub-rules and requires a four-dimensional representation because it has four input variables. We present only the sub-rules which are used in the example. Table 2 shows the sub-rules when [InformationEquivalence, id3, (Between: id1), (And: id2), (Relevance: Strongly)] ∧f [CandidateClass, id2, (Relevance: Strongly)]. Further, if [InformationEquivalence, id3, (Between: id1), (And: id2), (Relevance: Weakly)], candidate class P1 is selected as a non-redundant candidate class with the same relevance value as it had before applying this rule.

                           P1 ← CandidateClass, Relevance:
P4 ← Descriptive,
Relevance:                 Weakly    Slightly    Fairly      Substantially    Strongly
Weakly                     Weakly    Weakly      Weakly      Weakly           Weakly
Slightly                   Weakly    Weakly      Weakly      Slightly         Slightly
Fairly                     Weakly    Weakly      Slightly    Slightly         Fairly
Substantially              Weakly    Slightly    Slightly    Fairly           Substantially
Strongly                   Weakly    Slightly    Fairly      Substantially    Strongly

Each entry gives the resulting relevance of P1 ← Non-RedundantCandidateClass.

Table 2. Relation between the input variables and the result of R(2).

R(3)

Attribute Identification: IF AN ENTITY IN A REQUIREMENT SPECIFICATION IS RELEVANCE VALUE RELEVANT AND CAN EXIST AUTONOMY VALUE AUTONOMOUS IN THE APPLICATION DOMAIN, THEN SELECT IT AS A RELEVANCE VALUE RELEVANT ATTRIBUTE.

P ← [Entity, id1, (Relevance: V1 ∈ {Weakly, Slightly, Fairly, Substantially, Strongly}), (Autonomy: V2 ∈ {Dependently, Partially Dependently, Fully Autonomously})] ⇒f
P ← [Attribute, id2, (Relevance: V3 ∈ {Weakly, Slightly, Fairly, Substantially, Strongly})]

Table 3 shows the sub-rules of R(3).


The entries give P ← Attribute, Relevance; rows are P ← Entity, Relevance and columns are P ← Entity, Autonomy.

| P ← Entity, Relevance \ Autonomy: | Fully Autonomously | Partially Dependently | Dependently |
|---|---|---|---|
| Weakly | Weakly | Weakly | Weakly |
| Slightly | Weakly | Slightly | Slightly |
| Fairly | Weakly | Slightly | Fairly |
| Substantially | Weakly | Fairly | Substantially |
| Strongly | Slightly | Fairly | Strongly |

Table 3. Relation between the input variables and the result of R(3).

R(4)

Candidate Class Attribute Conversion: IF NON-REDUNDANT CANDIDATE CLASS P1 IS RELEVANCE VALUE RELEVANT AND NON-REDUNDANT CANDIDATE CLASS P2 IS RELEVANCE VALUE RELEVANT AND P1 QUALIFICATION VALUE QUALIFIES P2 THEN SELECT P1 AS A RELEVANCE VALUE RELEVANT ATTRIBUTE AND SELECT P1 AS A RELEVANCE VALUE RELEVANT CLASS.

P1 ← [Non-RedundantCandidateClass, id1, (Relevance: V1 ∈ {Weakly, Slightly, Fairly, Substantially, Strongly})] ∧f
P2 ← [Non-RedundantCandidateClass, id2, (Relevance: V2 ∈ {Weakly, Slightly, Fairly, Substantially, Strongly})] ∧f
P3 ← [Qualification, id3, (Qualifier: id1 ∈ Non-RedundantCandidateClass), (Qualified: id2 ∈ Non-RedundantCandidateClass), (Relevance: V3 ∈ {Weakly, Slightly, Fairly, Substantially, Strongly})] ⇒f
P1 ← [Attribute, id4, (Relevance: V4 ∈ {Weakly, Slightly, Fairly, Substantially, Strongly})] ∧f
P1 ← [Class, id5, (Relevance: V5 ∈ {Weakly, Slightly, Fairly, Substantially, Strongly})]

Note that P1 denotes two artifacts, id4 and id5. In the fuzzy-logic based method, an artifact name can refer to multiple artifacts of different concepts at the same time. When executable programs have to be generated, this conflict must be resolved by selecting, for example, the artifact with the highest relevance value. In the design model, however, this conflicting situation may be preserved, because the relevance of an artifact can change depending on various contextual factors. Table 4 shows the sub-rules of rule Candidate Class Attribute Conversion when [Non-RedundantCandidateClass, id1, (Relevance: Strongly)] and the output artifact is an attribute. When the output artifact is a class, the sub-rules can be obtained by replacing Weakly, Slightly, Fairly, Substantially and Strongly with Strongly, Substantially, Fairly, Slightly and Weakly, respectively, in the table.
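The bookkeeping described above, where a single name can denote several artifacts at once and conflicts are resolved only when code must be generated, can be sketched as follows. The dictionary-based repository and all identifiers are illustrative assumptions, not the paper's actual CASE implementation.

```python
# Sketch: one artifact name may be classified under several concepts at once,
# each with its own linguistic relevance value. Conflict resolution (needed
# only for code generation) picks the concept with the highest relevance.

LEVELS = ["Weakly", "Slightly", "Fairly", "Substantially", "Strongly"]

repository = {}  # name -> {concept: linguistic relevance}

def classify(name, concept, relevance):
    """Record (or update) one classification of an artifact name."""
    repository.setdefault(name, {})[concept] = relevance

def resolve(name):
    """Select the single concept with the highest relevance value."""
    return max(repository[name].items(), key=lambda kv: LEVELS.index(kv[1]))

# Coordinate ends up as both an attribute and a class in the example.
classify("Coordinate", "Attribute", "Substantially")
classify("Coordinate", "Class", "Slightly")
print(resolve("Coordinate"))  # ('Attribute', 'Substantially')
```

Deferring `resolve` as long as possible mirrors the paper's advice to preserve the conflicting classifications in the design model.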


The entries give the relevance of P1 as an Attribute; rows are P3 ← Qualification, Relevance and columns are P2 ← Non-RedundantCandidateClass, Relevance.

| P3 ← Qualification \ P2 Relevance: | Weakly | Slightly | Fairly | Substantially | Strongly |
|---|---|---|---|---|---|
| Weakly | Weakly | Weakly | Weakly | Weakly | Weakly |
| Slightly | Weakly | Weakly | Weakly | Slightly | Slightly |
| Fairly | Weakly | Weakly | Slightly | Slightly | Fairly |
| Substantially | Weakly | Slightly | Slightly | Fairly | Substantially |
| Strongly | Weakly | Slightly | Fairly | Substantially | Strongly |

Table 4. Relation between the input variables and the result of R(4).

R(5)

Aggregation Identification: IF CLASS P1 IS RELEVANCE VALUE RELEVANT AND CLASS P2 IS RELEVANCE VALUE RELEVANT AND P1 CONTAIN VALUE CONTAINS P2 THEN AGGREGATION BETWEEN P1 AND P2 IS RELEVANCE VALUE RELEVANT.

P1 ← [Class, id1, (Relevance: V1 ∈ {Weakly, Slightly, Fairly, Substantially, Strongly})] ∧f
P2 ← [Class, id2, (Relevance: V2 ∈ {Weakly, Slightly, Fairly, Substantially, Strongly})] ∧f
P3 ← [Containment, id3, (Container: id1 ∈ Class), (Contained: id2 ∈ Class), (Relevance: V3 ∈ {Weakly, Slightly, Fairly, Substantially, Strongly})] ⇒f
P4 ← [Aggregation, id4, (Aggregator: id1 ∈ Class), (Aggregated: id2 ∈ Class), (Relevance: V4 ∈ {Weakly, Slightly, Fairly, Substantially, Strongly})]

This rule requires a three-dimensional representation because it has three input variables. Table 5 shows the sub-rules in the case [Class, id2, (Relevance: Substantially)]. The entries give P4 ← Aggregation, Relevance; rows are P3 ← Containment, Degree and columns are P1 ← Class, Relevance.

| P3 ← Containment \ P1 ← Class: | Weakly | Slightly | Fairly | Substantially | Strongly |
|---|---|---|---|---|---|
| Weakly | Weakly | Weakly | Weakly | Weakly | Weakly |
| Slightly | Weakly | Weakly | Weakly | Slightly | Slightly |
| Fairly | Weakly | Weakly | Slightly | Slightly | Fairly |
| Substantially | Weakly | Slightly | Slightly | Fairly | Substantially |
| Strongly | Weakly | Slightly | Fairly | Substantially | Substantially |

Table 5. Relation between the input variables and the result of R(5).

R(6)

Inheritance Identification: IF CLASS P1 IS RELEVANCE VALUE RELEVANT AND CLASS P2 IS RELEVANCE VALUE RELEVANT AND P2 IS-A-KIND-OF VALUE IS-A-KIND-OF P1 THEN INHERITANCE BETWEEN P1 AND P2 IS RELEVANCE VALUE RELEVANT.

P1 ← [Class, id1, (Relevance: V1 ∈ {Weakly, Slightly, Fairly, Substantially, Strongly})] ∧f
P2 ← [Class, id2, (Relevance: V2 ∈ {Weakly, Slightly, Fairly, Substantially, Strongly})] ∧f
P3 ← [Is-a-kind-of, id3, (General: id1 ∈ Class), (Special: id2 ∈ Class), (Relevance: V3 ∈ {Weakly, Slightly, Fairly, Substantially, Strongly})] ⇒f
P4 ← [Inheritance, id4, (SuperClass: id1 ∈ Class), (SubClass: id2 ∈ Class), (Relevance: V4 ∈ {Weakly, Slightly, Fairly, Substantially, Strongly})]

The definition of this rule is similar to that of R(5); the output values of the sub-rules are the same as those of table 5. Here, the rows and columns indicate the degree of the is-a-kind-of relation between P1 and P2, and the possible relevance values of class P1, respectively.

R(7)

Inheritance Modification: This rule was defined in section 5.2. Actually, the relevance values of all the inheritance relations should be taken into account; for the sake of simplicity, we adopt this simplified version only.

Figure 8 displays the dependencies among the rules of our simple fuzzy-logic based object-oriented method. The structures of the two-valued and fuzzy-logic based methods, shown in Figures 3 and 8 respectively, are quite similar. There are, however, a number of important differences. Firstly, each rule in the two-valued logic based method classifies the properties of a set of one or more concepts into two subsets: accepted or rejected. If an entity in a requirement specification is rejected, for instance by rule Candidate Class Identification, it is no longer considered by rule Redundant Class Elimination. If an entity is rejected by both rules Candidate Class Identification and Attribute Identification, it is practically discarded.

In the fuzzy-logic based method, however, each rule assigns various grades of property values to a set of concepts. For example, if an entity is accepted as a weakly relevant candidate class by fuzzy rule Candidate Class Identification, it is still considered by fuzzy rule Redundant Class Elimination. In other words, in the fuzzy-logic based method none of the entities is fully accepted or rejected; each entity is forwarded to the coupled rules with a grade of property values. As a consequence, each entity is stored in both the class and attribute repositories, possibly with different property values. In the fuzzy-logic based method, therefore, in addition to reducing information loss by introducing more quantization levels, the loss of information is further decreased because no products are discarded.

Since the fuzzy-logic based method does not eliminate any concept, various different object models can be obtained for the same problem. Conflict resolution techniques must be introduced to select the most appropriate object model within a given context. Unlike the fuzzy-logic based method, the two-valued logic based method generates only a single object model for a given problem.
In addition, in the fuzzy-logic based method it is possible to tune the effects of the individual rules by applying weighting factors to their results. In Figure 8, the weighting factors are represented by Wi. The weighting factors used in the later phases of software development are likely to be higher than those used in the earlier phases, because more accurate and precise information is expected in the later phases.
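A weighted combination of rule results along these lines can be sketched as below. The triangular membership functions on [0, 1] and the pointwise weighted average used to merge the fuzzy sets are assumptions; the paper does not fix these choices at this point.

```python
# Sketch: each rule i contributes its output fuzzy set scaled by a weighting
# factor Wi; the sets are merged into one by a pointwise weighted average.

def triangular(a, b, c):
    """Triangular membership function with feet a, c and peak b."""
    def mu(x):
        if x <= a or x >= c:
            return 0.0
        return (x - a) / (b - a) if x <= b else (c - x) / (c - b)
    return mu

def weighted_combination(results):
    """results: list of (membership_function, weight) pairs."""
    total = sum(w for _, w in results)
    return lambda x: sum(w * mu(x) for mu, w in results) / total

# Assumed shapes for two of the linguistic values used in the text.
slightly = triangular(0.0, 0.25, 0.5)
substantially = triangular(0.5, 0.75, 1.0)

# A rule weighted 2 dominates a rule weighted 1 in the combined set.
combined = weighted_combination([(slightly, 1.0), (substantially, 2.0)])
print(combined(0.25), combined(0.75))  # 0.333... and 0.666...
```

The combined fuzzy set can then be carried forward unchanged, or defuzzified when a crisp decision is finally required.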


Fig. 8. The dependencies between the rules of the fuzzy-logic based method.

5.4.2 Analyzing the Graphics Application Example by Using the Fuzzy-Logic Based Method

In the following, we apply fuzzy rules R(1) to R(7) to the graphics application given in section 2.5.2. The entities provided to fuzzy rule Candidate Class Identification are the same as those provided to the two-valued logic based method illustrated in section 2.5.2: Graphics-Application, Tool, Graphic-Element, Point, Line, Rectangle, Circle, Square, Coordinate, Reference-Point, Diagonal-Line, Center, Radius, and Color. Using table 1,

fuzzy rule Candidate Class Identification qualifies the entities Graphics-Application and Tool as weakly relevant candidate classes, because they are considered weakly relevant entities and they can exist fully autonomously in the application domain. Entity Color is selected as a slightly relevant candidate class because Color is a strongly relevant entity that can exist only dependently. All the other entities are selected as strongly relevant candidate classes because they are all strongly relevant and can exist fully autonomously.

Candidate classes Square, Rectangle, Line, Diagonal-Line, Radius, Point, Reference-Point and Center are all strongly relevant candidate classes. The groups of candidate classes whose information contents are strongly equivalent are the following: (Square, Rectangle), (Line, Diagonal-Line, Radius) and (Point, Reference-Point, Center). Here, classes Rectangle, Line and Point are considered substantially descriptive with respect to their pair classes and, referring to table 2, they are selected as substantially relevant non-redundant candidate classes by fuzzy rule Redundant Class Elimination. Their pair classes are selected as slightly relevant non-redundant candidate classes because they are only slightly descriptive with respect to their pair classes. The candidate classes whose information contents are weakly equivalent are selected as non-redundant candidate classes with the same relevance values as they had before applying this rule.

Fuzzy rule Attribute Identification, as defined by table 3, qualifies Color as a strongly relevant attribute, since Color is strongly relevant and can exist dependently. Graphics-Application and Tool are considered weakly relevant attributes. Further, this rule qualifies all the other entities in the requirement specification as slightly relevant attributes.
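The table-driven reasoning applied above can be mechanized as a simple lookup. The sketch below encodes Table 2 (the R(2) sub-rules for strongly relevant, strongly information-equivalent pairs); the function and variable names are illustrative, not the paper's tool interface.

```python
# Sketch: the sub-rules of Table 2 as a lookup from the relevance of a
# candidate class and its descriptiveness (w.r.t. its pair class) to the
# relevance of the resulting non-redundant candidate class.

LEVELS = ["Weakly", "Slightly", "Fairly", "Substantially", "Strongly"]

# TABLE_2[descriptive][index of candidate-class relevance] -> result
TABLE_2 = {
    "Weakly":        ["Weakly", "Weakly", "Weakly", "Weakly", "Weakly"],
    "Slightly":      ["Weakly", "Weakly", "Weakly", "Slightly", "Slightly"],
    "Fairly":        ["Weakly", "Weakly", "Slightly", "Slightly", "Fairly"],
    "Substantially": ["Weakly", "Slightly", "Slightly", "Fairly", "Substantially"],
    "Strongly":      ["Weakly", "Slightly", "Fairly", "Substantially", "Strongly"],
}

def redundant_class_elimination(p1_relevance, descriptive):
    """Apply the R(2) sub-rules of Table 2."""
    return TABLE_2[descriptive][LEVELS.index(p1_relevance)]

# Rectangle: strongly relevant and substantially descriptive w.r.t. Square.
print(redundant_class_elimination("Strongly", "Substantially"))  # Substantially
```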
Class Coordinate strongly qualifies the substantially relevant non-redundant candidate class Point and is therefore selected as a substantially relevant attribute and as a slightly relevant class by fuzzy rule Candidate Class Attribute Conversion, as defined in table 4. All the remaining classes except Color are considered weakly relevant attributes, and they become classes with the same relevance values that they had as non-redundant candidate classes. Color remains a strongly relevant attribute and becomes a weakly relevant class.

The results of rules R(3) and R(4) show that a name can refer to more than one artifact of the same concept. Therefore, the result obtained by R(4) must be combined with the result obtained by applying R(3). If we assume equal weighting factors for rules R(3) and R(4), we obtain the following: Color is a strongly relevant attribute, the relevance of Coordinate as an attribute is defined by the composition of the membership functions Slightly and Substantially, and the relevance of all the remaining entities as attributes is defined by the composition of the membership functions Weakly and Slightly. The compositions of the linguistic values Slightly and Substantially, and Weakly and Slightly, are represented in Figures 9(a) and 9(b), respectively.

Classes Graphic-Element and Circle are strongly relevant classes and are considered to strongly contain the substantially relevant class Point; therefore, according to table 5, the aggregation relations between them and Point are substantially relevant. Classes Rectangle and Line are substantially relevant classes and are also considered to strongly contain Point, and therefore the aggregation relations between them and Point are substantially relevant. Further, Circle and Rectangle strongly contain Line, and therefore the aggregation relations between them and Line are substantially relevant.
Classes Square, Reference-Point, Diagonal-Line, Center and Radius are slightly relevant classes, and therefore the aggregation relations between them and Point are slightly relevant. Similarly, the aggregation relation between Square and Line is slightly relevant. All the remaining possible aggregation relations are considered to be weakly relevant.

Fig. 9. Composition of values: 9(a) Slightly and Substantially, and 9(b) Weakly and Slightly.

Class Circle is a strongly relevant class and is considered to be strongly is-a-kind-of Graphic-Element. Since Graphic-Element is a strongly relevant class, the inheritance relation between Circle and Graphic-Element is considered to be strongly relevant. Classes Point, Line and Rectangle are substantially relevant classes and are considered to be strongly is-a-kind-of Graphic-Element. Therefore, according to table 5, the inheritance relations between these classes and Graphic-Element are substantially relevant. Classes Square, Reference-Point, Diagonal-Line, Center and Radius are slightly relevant classes and are considered to be strongly is-a-kind-of Graphic-Element. Therefore, the inheritance relations between these classes and Graphic-Element are slightly relevant. All the remaining possible inheritance relations are considered to be weakly relevant. Using the sub-rules in section 5.2, rule R(7) concludes that the previously identified inheritance relations have a low complexity. The class and attribute relevance values of the entities in the requirement specification are shown in the following table:

| Entity | Class, Relevance: | Attribute, Relevance: |
|---|---|---|
| Graphics-Application | Weakly | Weakly |
| Tool | Weakly | Weakly |
| Graphic-Element | Strongly | Weakly, Slightly |
| Point | Substantially | Weakly, Slightly |
| Line | Substantially | Weakly, Slightly |
| Rectangle | Substantially | Weakly, Slightly |
| Circle | Strongly | Weakly, Slightly |
| Square | Slightly | Weakly, Slightly |
| Coordinate | Slightly | Slightly, Substantially |
| Reference-Point | Slightly | Weakly, Slightly |
| Diagonal-Line | Slightly | Weakly, Slightly |
| Center | Slightly | Weakly, Slightly |
| Radius | Slightly | Weakly, Slightly |
| Color | Weakly | Strongly |

Table 6. Class and attribute relevance values of the entities in the requirement specification.


The results of applying fuzzy rules R(5) and R(6) to identify aggregation and inheritance relations are shown in table 7; all aggregation and inheritance relations not listed are weakly relevant:

| Aggregator | Aggregated | Aggregation, Relevance: |
|---|---|---|
| Graphic-Element | Point | Substantially |
| Circle | Point | Substantially |
| Circle | Line | Substantially |
| Rectangle | Point | Substantially |
| Rectangle | Line | Substantially |
| Line | Point | Substantially |
| Square | Point | Slightly |
| Square | Line | Slightly |
| Reference-Point | Point | Slightly |
| Diagonal-Line | Point | Slightly |
| Center | Point | Slightly |
| Radius | Point | Slightly |

| Superclass | Subclass | Inheritance, Relevance: |
|---|---|---|
| Graphic-Element | Circle | Strongly |
| Graphic-Element | Point | Substantially |
| Graphic-Element | Line | Substantially |
| Graphic-Element | Rectangle | Substantially |
| Graphic-Element | Square | Slightly |
| Graphic-Element | Reference-Point | Slightly |
| Graphic-Element | Diagonal-Line | Slightly |
| Graphic-Element | Center | Slightly |
| Graphic-Element | Radius | Slightly |

Table 7. Relevance of aggregation and inheritance relations between the classes.

5.5 The Contextual Bias Problem in Fuzzy-Logic Based Methods

In section 3, we pointed out that contextual factors can influence the reliability of the results of the rules in two ways: by affecting the inputs of the rules and by compromising the validity of the rules. In our fuzzy-logic based approach, the first effect is reduced by increasing the number of quantization levels. Consider rule Candidate Class Identification. The selection of an entity as a candidate class is based on the software engineer's perception of relevance. This perception can differ from software engineer to software engineer. In two-valued logic based methods, a small difference in perception can cause contradictory results. For example, assume that the same entity in a requirement specification is judged differently by two software engineers, one as slightly and the other as substantially relevant. With a two-level quantization process, it is likely that the first software engineer would reject and the second one would accept the entity as a candidate class. By increasing the number of quantization levels, the difference between the input values caused by contextual factors is not amplified. With the five quantization levels shown in Figure 5, the difference in perception of relevance corresponds to the difference between the quantization levels A1 and A3 of the input values Slightly and Substantially, which is 0.5. With two levels, however, the difference would be 1.

The effect of contextual factors on the validity of a rule can be reduced by modeling the influence of the context explicitly. The validity of a rule is determined by the validity of its conditions. For instance, let us consider rule Inheritance Modification as defined in section 2.5.1:

IN THE CLASS HIERARCHY, IF THE NUMBER OF IMMEDIATE SUBCLASSES SUBORDINATED TO A CLASS IS LARGER THAN 5, THEN THE INHERITANCE HIERARCHY IS COMPLEX.

The condition of this rule may not be valid for certain kinds of applications. For example, in graphics applications, it appears natural that many classes inherit directly from class Point.

Our solution to this problem is to adapt the meaning of linguistic values based on the contextual factors. Consider the fuzzy-logic rule Inheritance Modification:

P1 ← [Class, id1, (ImmediateSubclasses: V1 ∈ {Low, Medium, High})] ⇒f
P2 ← [Inheritance, id2, (Complexity: V2 ∈ {Low, Medium, High})]

The validity of this rule depends on the meanings associated with the linguistic values Low, Medium and High. The fuzzy sets associated with these linguistic values were given in Figure 7. Different contexts may associate different meanings with a linguistic value. For instance, in the case of a graphics application, the membership functions associated with the linguistic values should be adapted so that higher numbers of immediate subclasses become acceptable. A membership function can be adapted by translating, compressing and dilating it. The translation operation shifts the membership function along the X axis; Figure 10 shows a linear dilation. The compression or dilation factor has to be related to the contextual factors. In general, it is difficult to formalize this relation by analytical functions, and therefore heuristic rules have to be adopted. Since rules defining the effect of contextual factors are typically expressed in terms of linguistic expressions, fuzzy logic is appropriate for implementing these rules as well. For instance, let us consider using a linear dilation to adapt the meaning of the linguistic values Low, Medium and High for property Number of Immediate Subclasses. The relation between the type of application and the degree of dilation may be expressed by the following contextual rule:

P1 ← [GraphicProcessingApplication, id1, (Certainty: V1 ∈ {Doubtfully, Approximately, Certain})] ⇒f
P2 ← [Dilation, id2, (Degree: V2 ∈ {Low, Medium, High})]

The sub-rules are:

P1 ← [GraphicProcessingApplication, id11, (Certainty: Doubtfully)] ⇒f P2 ← [Dilation, id21, (Degree: Low)]
P1 ← [GraphicProcessingApplication, id12, (Certainty: Approximately)] ⇒f P2 ← [Dilation, id22, (Degree: Medium)]
P1 ← [GraphicProcessingApplication, id13, (Certainty: Certain)] ⇒f P2 ← [Dilation, id23, (Degree: High)]
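Once a dilation degree has been defuzzified, adapting a membership function amounts to stretching it along its universe of discourse. The sketch below is an illustration under stated assumptions: the piecewise-linear High function and the purely linear dilation are not taken from the paper's tool.

```python
# Sketch: linear dilation of a membership function along the X axis.

def dilate(mu, factor):
    """Return mu stretched horizontally by `factor` (factor > 1 dilates)."""
    return lambda x: mu(x / factor)

def high(x):
    # Illustrative 'High' membership over the number of immediate subclasses:
    # 0 up to 10 subclasses, rising linearly to 1 at 20 subclasses.
    if x <= 10:
        return 0.0
    if x >= 20:
        return 1.0
    return (x - 10) / 10.0

# A dilation degree of 2, as in the example of Figure 10: what counted as
# 'High' at 15 subclasses now only counts as 'High' at 30.
high_graphics = dilate(high, 2.0)
print(high(15), high_graphics(30))  # 0.5 0.5
```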

Depending on the type of application, the contextual rules determine a value for the linguistic variable Dilation. By defuzzifying this value, the dilation factor can be obtained. Figure 10 shows the dilation process with a dilation factor of 2.

Fig. 10. Adapting to context through dilating the membership functions.


6. Evaluation of the Fuzzy-Logic Based Approach

6.1 Comparison of the Two-Valued Logic and Fuzzy-Logic Based Methods

In the following, we compare the effects of the two-valued and fuzzy-logic based rules from two perspectives: the resulting object models, and the reusability and adaptability of these models.

6.1.1 The Resulting Object Models

The object model created by the application of the two-valued logic based method, depicted in Figure 4, contains only the accepted object-oriented artifacts. The object model developed by the application of the fuzzy-logic based method, shown in tables 6 and 7, contains object-oriented artifacts and their property values expressed as linguistic values. These linguistic values can be converted to crisp values by defuzzifying them. For example, the application of the center-of-area defuzzification strategy results in the crisp values 0.07, 0.19, 0.25, 0.5 and 0.93 for the linguistic values Weakly, the composition of Weakly and Slightly, Slightly, the composition of Slightly and Substantially, and Strongly, respectively. If needed, to limit the size of the object model, artifacts with crisp values below a certain threshold can be eliminated. It is, however, preferable to defer the defuzzification process as long as possible, because as long as the values remain fuzzy, the subsequent design rules can be applied to artifacts without losing much information.

The resulting object models shown in Figure 4 and tables 6 and 7 are quite similar. The accepted artifacts in Figure 4 appear in tables 6 and 7 as artifacts with a high degree of property values. Unlike in the two-valued logic based method, however, in the fuzzy-logic based method a name generally refers to multiple artifacts. For example, Coordinate is a slightly relevant class but also an attribute with a relevance value determined by the composition of the linguistic values Slightly and Substantially.
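A center-of-area defuzzifier of the kind used above can be sketched numerically. The triangular shape assumed here for Slightly is an illustration, so the crisp values quoted in the text (0.07, 0.19, and so on) are not necessarily reproduced exactly by this sketch.

```python
# Sketch: center-of-area defuzzification by a simple Riemann sum.

def center_of_area(mu, lo=0.0, hi=1.0, steps=1000):
    """Approximate the center of area of fuzzy set mu over [lo, hi]."""
    xs = [lo + (hi - lo) * i / steps for i in range(steps + 1)]
    num = sum(x * mu(x) for x in xs)
    den = sum(mu(x) for x in xs)
    return num / den

def slightly(x):
    # Assumed triangular 'Slightly' peaked at 0.25 with support [0, 0.5].
    return max(0.0, 1.0 - abs(x - 0.25) / 0.25)

print(round(center_of_area(slightly), 2))  # 0.25
```

For a symmetric set the center of area coincides with the peak; for the asymmetric compositions produced by weighted rule combination it shifts toward the heavier component.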
6.1.2 Adaptability and Reusability of the Object Models

To evaluate the adaptability and reusability of the two-valued and fuzzy-logic based object models, in the following we continue the analysis process by applying two new fuzzy rules, R(8) and R(9). These rules consider the object-oriented concepts classes, attributes and inheritance relations from the perspective of the operations of objects. They can therefore only be applied after the identification of the operations of objects. Similar rules could also be defined for the two-valued logic based method; however, their effect would be limited to the previously selected artifacts.

R(8)

Class Cohesion: IF OPERATIONS BELONG TO A CLASS P MEMBERSHIP VALUE THEN SELECT P AS A RELEVANCE VALUE RELEVANT CLASS AND AS A RELEVANCE VALUE RELEVANT ATTRIBUTE.

P ← [Class, id1, (OperationCohesion: V1 ∈ {Weakly, Slightly, Fairly, Substantially, Strongly})] ⇒f
P ← [Class, id2, (Relevance: V2 ∈ {Weakly, Slightly, Fairly, Substantially, Strongly})] ∧f
P ← [Attribute, id3, (Relevance: V3 ∈ {Weakly, Slightly, Fairly, Substantially, Strongly})]


Tables 8 and 9 show the sub-rules of rule R(8) when P indicates a class and an attribute, respectively:

| P ← Class, OperationCohesion: | Weakly | Slightly | Fairly | Substantially | Strongly |
|---|---|---|---|---|---|
| P ← Class, Relevance: | Weakly | Slightly | Fairly | Substantially | Strongly |

Table 8. Relation between operation cohesion and class relevance.

| P ← Class, OperationCohesion: | Weakly | Slightly | Fairly | Substantially | Strongly |
|---|---|---|---|---|---|
| P ← Attribute, Relevance: | Strongly | Substantially | Fairly | Slightly | Weakly |

Table 9. Relation between operation cohesion and attribute relevance.

R(9)

Inheritance and Operation Reuse: IF CLASS P1 IS RELEVANCE VALUE RELEVANT AND CLASS P2 IS RELEVANCE VALUE RELEVANT AND THE OPERATIONS DEFINED IN P1 ARE DEGREE SUBSET OF THE OPERATIONS DEFINED IN P2 THEN INHERITANCE BETWEEN P1 AND P2 IS RELEVANCE VALUE RELEVANT.

P1 ← [Class, id1, (Relevance: V1 ∈ {Weakly, Slightly, Fairly, Substantially, Strongly})] ∧f
P2 ← [Class, id2, (Relevance: V2 ∈ {Weakly, Slightly, Fairly, Substantially, Strongly})] ∧f
P3 ← [OperationSubset, id3, (SubSet: id1 ∈ Class), (SuperSet: id2 ∈ Class), (Degree: V3 ∈ {Roughly, Partially, Fully})] ⇒f
P4 ← [Inheritance, id4, (SuperClass: id1 ∈ Class), (SubClass: id2 ∈ Class), (Relevance: V4 ∈ {Weakly, Slightly, Fairly, Substantially, Strongly})]

This rule requires a three-dimensional representation because it has three input variables. Table 10 shows the sub-rules when [Class, id2, (Relevance: Substantially)]. The entries give P4 ← Inheritance, Relevance; rows are P3 ← OperationSubset, Degree and columns are P1 ← Class, Relevance.

| P3 ← OperationSubset \ P1 ← Class: | Weakly | Slightly | Fairly | Substantially | Strongly |
|---|---|---|---|---|---|
| Roughly | Weakly | Weakly | Weakly | Weakly | Weakly |
| Partially | Weakly | Weakly | Slightly | Slightly | Fairly |
| Fully | Weakly | Slightly | Fairly | Fairly | Substantially |

Table 10. Relation between the input variables and the result of R(9).
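R(9) needs a degree for its operation-subset condition. A simple crisp measure, an assumption rather than the paper's definition, is the fraction of P1's operations that also appear in P2, mapped onto the linguistic values Roughly, Partially and Fully:

```python
# Sketch: estimating the OperationSubset degree of R(9) from operation sets.
# The thresholds 0.9 and 0.5 are illustrative assumptions.

def subset_degree(ops_sub, ops_super):
    """Map the overlap ratio |A ∩ B| / |A| onto a linguistic degree."""
    ops_sub, ops_super = set(ops_sub), set(ops_super)
    ratio = len(ops_sub & ops_super) / len(ops_sub)
    if ratio >= 0.9:
        return "Fully"
    return "Partially" if ratio >= 0.5 else "Roughly"

# Hypothetical operation sets for Point and Rectangle.
point_ops = {"move", "draw", "position"}
rectangle_ops = {"move", "draw", "position", "resize", "area"}
print(subset_degree(point_ops, rectangle_ops))  # Fully
```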

In the following, we consider the effect of rules R(8) and R(9) on the results generated by rules R(1) to R(7). As illustrated in section 2.5.3, in the two-valued logic based method, possible changes could be observed as the reincarnation and/or conversion of artifacts. In the fuzzy-logic based method, however, changes appear as the revaluation or devaluation of property values. We observe these changes especially for Square, Color and Coordinate, and for the aggregation and inheritance relations among the graphical elements.


Learning more about Square: During the application of the two-valued logic based method, Square was discarded by rule Redundant Class Elimination because Rectangle was considered more descriptive than Square. In the fuzzy-logic based method, Square was accepted as a slightly relevant class and also as an attribute with a relevance value determined by the composition of Weakly and Slightly. Now assume that, during the operation identification phase, we identify a set of operations that can be associated with Square with cohesion value Strongly. After the application of rule R(8), as illustrated by tables 8 and 9, Square becomes a substantially relevant class and a weakly relevant attribute. This result must be combined with the previous one obtained by rule Candidate Class Attribute Conversion. As the input values of rule R(8) appear less subjective than those of rule Candidate Class Attribute Conversion, we weight the result of rule R(8) twice as much as the result of rule R(4). Figures 11(a) and 11(b) show the relevance of Square as a class after applying rules R(4) and R(8), respectively. It is clear that Square is revaluated as a class after the application of rule R(8). The crisp value obtained by defuzzifying the fuzzy set in Figure 11(b) with the center-of-area strategy is 0.58, whereas the defuzzified value for Figure 11(a) is 0.25.

Fig. 11. Changes to the relevance value of Square as a class: (a) after applying R(4) and (b) R(8).

Learning more about Color: During the application of the two-valued logic based method, Color was considered an attribute and was discarded as a class. In the fuzzy-logic based method, Color was accepted as a weakly relevant class by rule R(4) and as a strongly relevant attribute by rules R(3) and R(4), which have equal weighting factors. Now let us assume that during the operation identification phase we realize that a number of color-processing operations are needed. These operations are associated with Color with cohesion value Strongly and, using tables 8 and 9, Color is now concluded to be a substantially relevant class and a weakly relevant attribute. The result obtained by rule R(8) has to be combined with the results obtained by the previous rules. We assume that the weighting factor associated with rule R(8) is 1. Figures 12(a) and 12(b) show the grade of relevance of Color as a class after the application of rules R(4) and R(8), respectively. Similarly, Figure 13 shows the grade of relevance of Color as an attribute. If we defuzzify the relevance values of Color, we obtain a relevance of 0.6 as a class and 0.36 as an attribute. Color is thus revaluated as a class but devaluated as an attribute.

Fig. 12. Grade of relevance of Color as a class: (a) after applying R(4) and (b) R(8).

Fig. 13. Grade of relevance of Color as an attribute: (a) after applying R(4) and (b) R(8).

Learning more about Coordinate: In the two-valued logic based method, Coordinate was selected as an attribute of class Point. In the fuzzy-logic based method, however, the application of rules R(3) and R(4) made Coordinate a slightly relevant class and an attribute with a relevance value determined by the composition of the linguistic values Slightly and Substantially. The application of rule R(8) affects the grade of relevance of Coordinate. Now assume that we identify a set of operations to process coordinate values. These operations, however, are associated with the graphical elements and not with Coordinate. Therefore, the cohesion value of Coordinate is Weakly. As a result of rule R(8), Coordinate is now qualified as a weakly relevant class and a strongly relevant attribute. Figures 14(a) and 14(b) show the grades of relevance of Coordinate as a class and as an attribute, respectively. Assuming that the weighting factors are 0.5 for rules R(3) and R(4), and 1 for rule R(8), the defuzzified grades of relevance as a class and as an attribute are 0.16 and 0.64, respectively. Clearly, Coordinate is now devaluated as a class but revaluated as an attribute.


Fig. 14. Grade of relevance of Coordinate: (a) as a class and (b) as an attribute.

Learning more about relations: In the two-valued logic based method, Point and Line were considered parts of Rectangle, and the possible inheritance relations between Rectangle, and Point and Line, were discarded. In the fuzzy-logic based method, the aggregation relations between Rectangle, and Point and Line, were considered substantially relevant, whereas the corresponding inheritance relations were considered weakly relevant. The fuzzy-logic based method qualified Point, Line and Rectangle as substantially relevant classes. Now assume that, during the operation identification phase, we realize that the operations defined for classes Point and Line are fully a subset of the operations defined for Rectangle. After the application of rule R(9), using table 10, the inheritance relation between Rectangle, and Point and Line, is now revaluated to substantially relevant. Since there is now more information available, we associate a weighting factor of 2 with this rule. The grades of relevance of the aggregation and inheritance relations are presented in Figure 15. The defuzzified relevance values for the aggregation and inheritance relations are 0.75 and 0.67, respectively. After the application of rules R(8) and R(9), the following artifacts have a defuzzified grade of relevance as a class higher than 0.5: Graphic-Element, Point, Line, Rectangle, Circle, Square and Color. Concerning the relations among classes, either Rectangle aggregates Point and Line with a defuzzified grade of relevance of 0.75, or Rectangle inherits from Line and Point with a defuzzified grade of relevance of 0.67. Since these values are quite close to each other, a decision must be made. If conceptual modeling is considered important, the aggregation relation can be selected. If reusability is the main concern, the inheritance relation can be the choice. However, such a binary choice can be deferred to the implementation phase.


Fig. 15. Grade of relevance of relations between Rectangle, Point and Line: (a) Rectangle aggregates Point and Line and (b) Rectangle inherits from Point and Line.

6.2 Evaluation with Respect to the Requirements

In this section we evaluate our approach with respect to the requirements presented in section 4.

• Increase the number of quantization levels: In section 5.2, we proposed a fuzzy-logic based approach to increase the number of levels in rules. In sections 2.3 and 5.3, we calculated the root-mean-square value of the quantization error for rule Candidate Class Identification. The error was 0.28 for the two-valued logic based method and 0.114 for the fuzzy-logic based method. Clearly, fuzzy-logic based rules cause less quantization error than two-valued logic based rules. In addition, in the fuzzy-logic based method, the accumulation of the quantization error during the software development process is much smaller than in the two-valued logic based method. The advantage of having more quantization levels was also illustrated practically during the analysis of the graphics application in section 5.4.2. Several entities, such as Square, Color and Coordinate, and relations, such as aggregation and inheritance, were expressed and processed using intermediate levels. In the end, this produced a more dependable object model, because the grade of acceptance of the properties of concepts could be expressed and preserved through the various phases. In the two-valued logic based method, not being able to express intermediate levels caused the elimination of some artifacts. As discussed in section 2.5.3, in this case, several iterations would be necessary to reincarnate and convert artifacts in order to improve the object model. Intermediate levels can be expressed by using linguistic values such as Weakly, Slightly, Fairly, Substantially and Strongly. In our experimental CASE environment, for most rules, the software engineer is asked to provide information about the concepts being reasoned about.
For example, for rule Candidate Class Identification, the software engineer is asked to provide relevance and autonomy values of the entity being selected. The linguistic values together with their meanings (membership functions), such as the one in Figure 5, are displayed by the tool. For some rules, such as rule R(7) defined in section 2.5.1, the software engineer can input crisp values as well. Being able to reason with linguistic values provides a very natural tool interface.
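The difference between the two error figures quoted above can be illustrated with a small simulation. The sketch below is a simplified uniform-quantizer model, not the exact membership-function computation of sections 2.3 and 5.3; it estimates the root mean square error introduced when a continuous grade in [0,1] is snapped to the nearest representable level: two levels for accept/reject rules, five for the linguistic values Weakly through Strongly.

```python
import numpy as np

def rms_quantization_error(levels, n=200_000, seed=0):
    """RMS error when continuous grades uniformly drawn from [0, 1]
    are snapped to the nearest of `levels` (representable values)."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(0.0, 1.0, n)
    levels = np.asarray(levels, dtype=float)
    q = levels[np.abs(x[:, None] - levels[None, :]).argmin(axis=1)]
    return float(np.sqrt(np.mean((x - q) ** 2)))

two_valued = rms_quantization_error([0.0, 1.0])            # accept/reject only
five_level = rms_quantization_error(np.linspace(0, 1, 5))  # Weakly .. Strongly

print(f"two-valued RMS error ~ {two_valued:.3f}")   # close to 1/sqrt(12) = 0.289
print(f"five-level RMS error ~ {five_level:.3f}")
```

The two-valued figure reproduces the 0.28 reported for the two-valued logic based method; the fuzzy figure differs from the reported 0.114 because the method's actual membership functions are not uniform quantization levels.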


• Minimize early elimination of concepts: In the fuzzy-logic based method, in principle, none of the concepts is eliminated. During the application of the first four fuzzy rules, for instance, Square became a strongly relevant candidate class, a slightly nonredundant class, a slightly relevant class and an attribute whose relevance was determined by the composition of Weakly and Slightly. In the two-valued logic based method, however, Square was eliminated and therefore not included in the final object model shown in Figure 4. The fuzzy-logic based method can be considered a learning process; a new aspect of the problem being considered is learned after the application of each rule. Obviously, a new aspect can modify previously gathered property values. Fuzzy-logic theory provides techniques to reason about and compose the results of the rules. For example, as shown by Figure 11, after applying rule R(8), Square was revaluated as a class although it was first considered a slightly relevant class. Similarly, during the analysis process, as shown by Figures 12 and 13, Color was revaluated as a class but devaluated as an attribute. Further, Figure 14 illustrates how Coordinate was revaluated as an attribute. Finally, Figure 15 shows the devaluation of the aggregation relations and the revaluation of the inheritance relations. Clearly, as illustrated in section 6.1.2, software development through learning, as exemplified by the fuzzy-logic based method, creates very adaptable and reusable design models.

• Explicit modeling of the influence of context: As illustrated in section 5.5, the fuzzy-logic based method is suitable for tuning rules with respect to contextual information. The output of the contextual rules can modify the universe of the linguistic variables used in the rules of a method. Defining contextual rules, however, may still be a difficult task.
We therefore think that fuzzy-logic based rules are best suited to expressing these rules, because the influence of contextual parameters cannot always be expressed in a deterministic way.

• Quality versus cost: Increasing the number of quantization levels improves the quality of the software development process but also increases the cost of the CASE environment. In addition, since none of the concepts is eliminated in the fuzzy-logic based method, all possible concepts must be stored. However, most concepts are likely to have the lowest grade of property values, such as Weakly. This makes it possible to minimize storage requirements by registering only the concepts whose property values have a grade higher than, for example, Weakly. Similar to designing digital signal processing systems, the fuzzy-logic based method provides a unique opportunity to tune the quality of CASE environments with respect to memory and processing costs.

7. Related Work

In the following, we briefly present the application of fuzzy-logic based reasoning to various areas. Only very few publications have combined fuzzy-logic techniques with object-oriented concepts.


Applications of fuzzy-logic based reasoning

The application of fuzzy logic to various areas is becoming more and more common. Examples include fuzzy databases which manage fuzzy information [24], fuzzy pattern recognition systems which recognize audio and visual signals [14], fuzzy decision support systems which are based on uncertain information [18], and fuzzy control systems which use heuristic rules to control complex structures [19]. To the best of our knowledge, fuzzy-logic based reasoning has not been applied to object-oriented methods before.

Fuzzy object-oriented models

In [12], the notion of fuzzy objects is introduced. Fuzzy objects can have attributes that contain fuzzy sets as values, and there may be a partial inheritance relation between classes. For example, class ToyVehicle can be defined to inherit the property Cost with a grade of 0.9 from class Toy and with a grade of 0.3 from class Vehicle. In class Toy the property Cost is initialized to Low, whereas in class Vehicle it is initialized to High. By taking the fuzzy union of the fuzzy sets determined by Low and High, the value of property Cost of class ToyVehicle can be obtained. This is an interesting approach to object modeling, but it does not propose fuzzy-logic based techniques for object-oriented software development. In contrast, our approach focuses on the application of fuzzy-logic based reasoning to object-oriented methods, to reduce the quantization error and to enhance the adaptability and reusability of design models.

In [24], fuzzy objects are defined to represent complex objects with uncertainty in their attribute values. Further, the operator merge is introduced to combine two objects into a single object, provided that predefined value levels are achieved. In relational databases, such a merge operator makes it possible to retrieve data based on uncertain information.

CASE environments

In [1], fuzzy logic is applied to evaluate object-oriented CASE tools.
Here, the ISO software product quality evaluation process model has been extended with fuzzy-logic based rules.

Recently, there has been a considerable amount of interest in formalizing software processes. Among many experimental systems, Merlin [17] and MARVEL [13] are related to our approach since they are based on rule-based expert systems. These systems, however, adopt two-valued logic based reasoning. In addition, our emphasis is not on modeling software processes, but on modeling methodological rules. The interdependencies among rules shown by Figures 3 and 8 are not fundamental to our work; our approach can be integrated with any of these process programming languages. However, the interdependencies among the rules do affect the propagation of errors and therefore must be taken into account when cumulative errors are to be computed.

8. Future Work

In addition to reducing the quantization error and contextual bias problems, the fuzzy-logic based approach opens a set of interesting perspectives on software development. In the following, we present some of these perspectives and briefly indicate our related research.


• Process and artifact metrics: The fuzzy-logic based method provides a mathematical means to evaluate each process and artifact; the root mean square value of the quantization error of a given process can be considered a process metric, and the defuzzified value of a property can be seen as an artifact metric. For example, after applying rule R(8), the metric value of Square as a class can be considered to be 0.58. We are currently investigating the applicability of fuzzy-logic techniques as process and artifact metrics. In particular, we would like to understand the precise relation between process and artifact metrics. We are also fuzzifying some popular object-oriented methods by expressing them as fuzzy-logic based rules and implementing them in our experimental CASE environment.

• Accumulative software development: Most current life-cycle models, such as the waterfall and spiral models [16], are inherently based on two-valued logic. Adopting fuzzy-logic reasoning brings new perspectives to modeling the software life-cycle. Basically, a fuzzy-logic based method implements an accumulative learning process; after each process, a new aspect of the software being developed can be learned. This naturally results in a very adaptable and reusable design model. We believe that fuzzy-logic based methods can shift today's emphasis from producing reusable code to creating reusable and adaptable design models.

• Design documentation: Each concept in the fuzzy-logic based method has a set of property-value pairs, which can be modified through the application of new rules implemented as fuzzy-logic operations. These operations can be stored as history information. Fuzzy-logic based object models, therefore, naturally document the complete software development history; the applied rules, the assumptions made by the software engineers and the contextual information can all be fully traced.
More importantly, this way of documenting design information is fully integrated with the object model, since the concepts that constitute the object model are created through the application of these rules. We are currently designing an atomic transaction framework by using a fuzzy-logic based method and are experimenting with the adaptability of the design model. In addition, we will carry out assessment studies on how this self-documenting design expertise can be utilized in a commercial organization5.

• Active methods: While developing software systems, decisions made about the validity of certain artifacts may change. For example, while analyzing the graphics application, the relevance value of Square as a class was revaluated considerably. Similarly, a lack of modeling techniques and methods for dedicated applications may force the designer to take decisions in a certain direction. Assumptions made about processing speed and heuristics about the dynamic behavior of a system may not remain valid. The property value of an artifact may change during software development or even during the operation phase. Since concepts are largely dependent on each other, changing the property values of a certain artifact may influence other related artifacts. For large software systems, it can be very tedious to monitor and validate all the dependencies manually.

5 This project is carried out together with Siemens-Nixdorf-Vianen under the Senter program, which is partially supported by the Dutch Ministry of Economic Affairs.


Fuzzy-logic based methods can maintain knowledge about the process and context of the software development activity so that they can assist software engineers more specifically. Consider, for example, the fuzzy rule Inheritance Identification shown in section 5.4. Here the relevance of an inheritance relation depends on two factors: the relevance values of the classes being considered and the quality of the is-a-kind-of relation. For example, if the classes have a strongly relevant is-a-kind-of relation but are qualified as weakly relevant classes, then according to Table 5, the inheritance relation between them will be qualified as weakly relevant. Now assume that, after the application of additional rules, these classes are revaluated as substantially relevant classes. Then the inheritance relation will correspondingly be revaluated as a substantially relevant relation. This was in fact experienced for Square: since Square was revaluated as a class after the application of rule R(8), its inheritance relations from Line and Point were revaluated as well. The fuzzy-logic based method automatically updates the related linguistic values because it is implemented as a fuzzy-logic reasoning system. Obviously, cyclic dependencies must be avoided here. We are currently experimenting with the usefulness of active methods.

• Experimental CASE environment: Our experimental CASE environment [7] is based on a fuzzy-logic reasoning framework [3] and has been implemented in the Smalltalk language [11]. The framework consists of 59 classes. For the method engineer, the CASE environment provides tools for defining linguistic variables, linguistic values and fuzzy rules. The software engineer basically interacts with tools that represent design rules. In addition, various browsers have been implemented for inspecting the grade of acceptance of concepts. The figures in this paper that indicate the grade of relevance of concepts were generated by the CASE environment.
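The automatic revaluation described under active methods can be sketched as a tiny dependency computation. Table 5 itself is not reproduced in this section, so the sketch below assumes, purely for illustration, that the relevance of an inheritance relation is the minimum of the relevance grades of the two classes and of the is-a-kind-of relation; this matches the behavior described (weakly relevant classes yield a weakly relevant inheritance relation, and revaluating a class revaluates its relations).

```python
# Illustrative numeric grades for the linguistic values used in the paper.
GRADES = {"Weakly": 0.1, "Slightly": 0.3, "Fairly": 0.5,
          "Substantially": 0.7, "Strongly": 0.9}

def to_label(grade):
    """Nearest linguistic value for a numeric grade."""
    return min(GRADES, key=lambda k: abs(GRADES[k] - grade))

def inheritance_relevance(class_a, class_b, is_a_kind_of):
    """Hypothetical min-composition of the three contributing grades."""
    return min(class_a, class_b, is_a_kind_of)

# Initially: strongly relevant is-a-kind-of, but weakly relevant classes.
r = inheritance_relevance(GRADES["Weakly"], GRADES["Weakly"], GRADES["Strongly"])
print(to_label(r))            # Weakly

# After further rules the classes are revaluated as substantially relevant;
# re-running the rule revaluates the inheritance relation automatically.
r = inheritance_relevance(GRADES["Substantially"], GRADES["Substantially"],
                          GRADES["Strongly"])
print(to_label(r))            # Substantially
```

In the actual CASE environment this update is performed by the fuzzy-logic reasoning system rather than by recomputing a single function, but the dependency structure is the same.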
9. Conclusions

This paper identifies the quantization error and contextual bias problems that one may experience during software development. These problems have been illustrated first mathematically and then experimentally with the help of a simple example. To minimize these problems, fuzzy-logic based techniques have been proposed. It has been shown formally and experimentally that fuzzy-logic based reasoning can reduce quantization errors and can be useful in adapting design rules to changing contexts. In addition, the application of fuzzy-logic based reasoning opens new perspectives on software development, such as new process and artifact metrics, an accumulative software life-cycle, integrated design documentation and active methods. The proposed fuzzy-logic based method has been implemented in our experimental CASE environment.

Acknowledgments

This work was carried out while the second author visited the University of Twente from November 1994 to October 1995. We highly appreciate the corrections made by Pim van den Broek on the formulation of the quantization error. We also thank Klaas van den Berg for his comments on an earlier version of this paper.


References

[1] K. Agavanakis, T. Antonakopoulus and V. Makios, “On Applying Fuzzy Sets in the Evaluation Process of Object-Oriented Supporting CASE Tool”, Proc. of EUROMICRO 95, pp. 564-57, Como, September 1995.
[2] M. Aksit and A. Tripathi, “Data Abstraction Mechanisms in Sina/ST”, Proc. of the OOPSLA ’88 Conference, ACM SIGPLAN Notices, Vol. 23, No. 11, pp. 265-275, 1988.
[3] M. Aksit, F. Marcelloni, B. Tekinerdogan, C. Vuijst and L. Bergmans, “Designing Software Architectures as a Composition of Specializations of Knowledge Domains”, Memoranda Informatica 95-44, University of Twente, Enschede, The Netherlands, December 1995.
[4] V.C. Basili and H.D. Rombach, “The TAME Project: Towards Improvement-Oriented Software Environments”, IEEE Transactions on Software Engineering, Vol. 14, No. 6, pp. 758-772, June 1988.
[5] G. Booch, Object-Oriented Design with Applications, The Benjamin/Cummings Publishing Company Inc., 1991.
[6] F. Brito e Abreu, “Object-Oriented Software Design Metrics”, Workshop on Pragmatic and Theoretical Directions in Object-Oriented Software Metrics, Addendum to the OOPSLA ’94 Proceedings, pp. 78-80, 1995.
[7] P. Broekhuizen, FLuent: A Fuzzy-Logic User Environment for an Object-Oriented Fuzzy-Logic Reasoning Framework, M.Sc. Thesis, University of Twente, Enschede, The Netherlands, September 1996.
[8] S.R. Chidamber and C.F. Kemerer, “A Metrics Suite for Object-Oriented Design”, IEEE Transactions on Software Engineering, Vol. 20, No. 6, pp. 476-492, June 1994.
[9] D. Coleman, P. Arnold, S. Bodoff, C. Dollin, H. Gilchrist, F. Hayes and P. Jeremaes, Object-Oriented Development: The Fusion Method, Prentice Hall, 1994.
[10] E. Gamma, R. Helm, R. Johnson and J. Vlissides, Design Patterns: Elements of Reusable Object-Oriented Software, Addison-Wesley, 1995.
[11] A. Goldberg and D. Robson, Smalltalk-80: The Language, Addison-Wesley, 1989.
[12] I. Graham, Object-Oriented Methods, 2nd Edition, Addison-Wesley, 1994.
[13] G.T. Heineman, G.E. Kaiser, N.S. Barghouti and I.Z. Ben-Shaul, “Rule-Chaining in Marvel: Dynamic Binding of Parameters”, IEEE Expert, Vol. 7, No. 6, pp. 26-82, December 1992.
[14] D.L. Hudson, M.E. Cohen and P.C. Deedwania, “Emerge - An Expert System for Chest Pain Analysis”, in Approximate Reasoning in Expert Systems, M.M. Gupta, A. Kandel, W. Bandler and J.B. Kiszka (eds.), pp. 705-717, Elsevier Science Publishers B.V., North-Holland, 1985.
[15] I. Jacobson, M. Christerson, P. Jonsson and G. Overgaard, Object-Oriented Software Engineering: A Use Case Driven Approach, Addison-Wesley/ACM Press, 1992.
[16] G.W. Jones, Software Engineering, John Wiley and Sons, 1990.
[17] G. Junkermann, B. Peuschel, W. Schafer and S. Wolf, “MERLIN: Supporting Cooperation in Software Development Through a Knowledge-Based Environment”, in Software Process Modelling and Technology, Finkelstein et al. (eds.), pp. 103-109, John Wiley and Sons, New York, 1994.
[18] J. Kacprzyk, S. Zadrozny and M. Fedrizzi, “An Interactive User-Friendly Decision Support System for Consensus Reaching Based on Fuzzy Logic with Linguistic Quantifiers”, in Fuzzy Computing, M.M. Gupta and T. Yamakawa (eds.), pp. 307-322, Elsevier Science Publishers B.V., North-Holland, 1988.
[19] C.C. Lee, “Fuzzy Logic in Control Systems: Fuzzy Logic Controller, Part II”, IEEE Transactions on Systems, Man, and Cybernetics, Vol. 20, No. 2, pp. 419-435, March/April 1990.
[20] A. Papoulis, Probability, Random Variables, and Stochastic Processes, McGraw-Hill, 1991.
[21] J. Rumbaugh, M. Blaha, W. Premerlani, F. Eddy and W. Lorensen, Object-Oriented Modeling and Design, Prentice-Hall, 1991.
[22] M. Schwartz, Information Transmission, Modulation, and Noise, 4th ed., McGraw-Hill, 1990.


[23] D. Ungar and R.B. Smith, “Self: The Power of Simplicity”, Proc. of the OOPSLA ’87 Conference, ACM SIGPLAN Notices, Vol. 22, No. 12, pp. 227-242, 1987.
[24] A. Yazici, R. George, B.P. Buckles and F.E. Petry, “A Survey of Conceptual and Logical Data Models for Uncertainty”, in Fuzzy Logic for the Management of Uncertainty, L.A. Zadeh and J. Kacprzyk (eds.), pp. 281-295, John Wiley & Sons, Inc., 1992.
[25] L.A. Zadeh, “Outline of a New Approach to the Analysis of Complex Systems and Decision Processes”, IEEE Transactions on Systems, Man, and Cybernetics, Vol. SMC-3, No. 1, pp. 28-44, January 1973.

Appendix: Fuzzy-Logic Based Reasoning

In fuzzy logic, the concept of vagueness is introduced through the definition of a fuzzy set. A fuzzy set S of a universe of discourse U is characterized by a membership function µ_S : U → [0,1] which associates with each element y of U a number µ_S(y) in the interval [0,1] representing the grade of membership of y in S [25]. Given two fuzzy sets A and B in a universe of discourse U, some basic operations are the following:







¬A = ∫_U (1 − µ_A(y))/y;   A ∩ B = ∫_U T(µ_A(y), µ_B(y))/y;   A ∪ B = ∫_U T*(µ_A(y), µ_B(y))/y

where the integral sign ∫_U µ(y)/y stands for the union of the points y at which µ(y) is positive, and T and T* identify a triangular norm and a triangular conorm, respectively. A triangular norm is a function T from [0,1] × [0,1] to [0,1] that satisfies the properties of symmetry, associativity and monotonicity, and, for a ∈ [0,1], the boundary condition T(a,1) = a. For a triangular conorm, the boundary condition is instead T*(a,0) = a.

Based on the definition of a fuzzy set, the concept of a linguistic variable is introduced to represent the languages typically adopted by experts. A linguistic variable is a variable whose values, called linguistic values, have the form of phrases or sentences in a natural language [25]. For instance, the linguistic variable height might assume the values tall, very tall, more or less tall, etc. In contrast with two-valued logic, in a rule such as IF X IS A THEN Y IS B, the predicates X IS A and Y IS B may hold with grades between 0 and 1 rather than Boolean truth values. In addition, the implication operator is a fuzzy relation rather than a connective defined by a truth table. A fuzzy relation R, from a set X to a set Y, is defined as a fuzzy set of the Cartesian product X × Y. R is characterized by a membership function µ_R(x,y) and is expressed by R = ∫_{X×Y} µ_R(x,y)/(x,y). Based on this definition, a large number of fuzzy implication operators have been proposed and their properties have been studied. In particular, their behavior on the generalized modus ponens has been widely analyzed. The generalized modus ponens is the most frequently used mechanism of fuzzy reasoning. In its most general form, this extension of the modus ponens may be expressed as:

    IF X IS A THEN Y IS B
       X IS A'
    ------------------------------
       Y IS B'

where the consequent B' is defined as B'(v) = sup_{u∈U} T(A'(u), A(u) ⇒_f B(v)), with T a triangular norm and ⇒_f a fuzzy implication operator. The sup_{u∈U} T operator is called the compositional operator, and the generalized modus ponens is therefore also named the compositional rule of inference [25]. Notice that the generalized modus ponens allows conclusions to be inferred even if the facts, in this case X IS A', match the antecedent part of the rule only approximately. A conclusion is expressed as a linguistic value. If we are interested in a crisp value, we must defuzzify the conclusion by a defuzzification strategy, which aims at producing the crisp value that best represents the linguistic value. The two most commonly used strategies are the mean of maxima and the center of area. The crisp value produced by the mean of maxima strategy is the mean of the elements that belong to the fuzzy set characterizing the conclusion with maximum grade. The center of area strategy produces the center of gravity of the fuzzy set characterizing the conclusion.
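On a discretized universe of discourse, the definitions above reduce to a few lines of code. The sketch below uses the common choices T = min and T* = max, triangular membership functions as an illustrative shape, and implements the two defuzzification strategies just described; the particular membership functions and numbers are assumptions for illustration, not those used by the method.

```python
import numpy as np

U = np.linspace(0.0, 1.0, 101)          # discretized universe of discourse

def triangular(a, b, c):
    """Triangular membership function rising from a, peaking at b, falling to c."""
    return np.maximum(np.minimum((U - a) / (b - a), (c - U) / (c - b)), 0.0)

low = triangular(0.0, 0.2, 0.5)
high = triangular(0.5, 0.8, 1.0)

complement_low = 1.0 - low              # ¬A
intersection = np.minimum(low, high)    # A ∩ B with T  = min
union = np.maximum(low, high)           # A ∪ B with T* = max

def mean_of_maxima(mu):
    """Mean of the universe elements where membership is maximal."""
    return float(U[mu == mu.max()].mean())

def center_of_area(mu):
    """Center of gravity of the fuzzy set."""
    return float((U * mu).sum() / mu.sum())

print(round(mean_of_maxima(high), 2))   # the peak of "high"
print(round(center_of_area(union), 2))  # balance point of the union
```

Note that min and max are only one admissible (T, T*) pair; any triangular norm and conorm satisfying the boundary conditions above may be substituted.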

