ASSESSMENT OF CLASSICAL SEARCH TECHNIQUES FOR IDENTIFICATION OF FUZZY MODELS

Angela Nebot

Antoni Jerez

Dept. Llenguatges i Sistemes Informatics, Universitat Politecnica de Catalunya, Modul C6 - Campus Nord, Jordi Girona Salgado, 1-3, Barcelona 08034, Spain. Phone: (343) 4015642 Fax: (343) 4017014

Keywords: Fuzzy systems; Inductive reasoning; Search techniques; Model identification.

ABSTRACT: Fuzzy Inductive Reasoning (FIR), a methodology based on the General System Problem Solver (GSPS), is a tool for identifying qualitative models of dynamical systems and for predicting their future behavior from past behavior. The identification of qualitative models in FIR is a complex search problem of behavior analysis: the search space grows exponentially with the number of candidate variables (m-inputs). In this paper, different classical techniques (Hill-Climbing, Branch and Bound, and Beam search with dynamic programming) have been implemented in order to assess their performance on this search problem. A comparative study of these techniques, applied to the qualitative model identification of real systems from biomedicine and biology, is also presented.

INTRODUCTION

The FIR methodology is based on the General System Problem Solver (GSPS) (Klir, 85), a tool for general system analysis that allows the conceptual modes of behavior of dynamical systems to be studied. FIR is a data-driven methodology based on system behavior rather than on structural knowledge. It has two main tasks. The first is to identify qualitative causal relations between the variables that compose the system, that is, to obtain qualitative models. The second is to infer the future behavior of the system from its past behavior, that is, to predict future output states. FIR is a powerful technique, suitable for modeling and simulating systems for which no or only very limited a priori structural knowledge is available, as is common in biomedicine, biology, and economics.

[Block diagram: quantitative system measurements -> Fuzzification -> qualitative values -> Qualitative Model Identification (Optimal Mask) -> Rule Base -> Qualitative Prediction -> qualitative forecast -> Defuzzification -> predicted quantitative values]

Figure 1: FIR main processes

Figure 1 shows the two main tasks of the FIR methodology in a schematic way, namely qualitative model identification and qualitative prediction. The fuzzy inductive reasoner is fed with data measured from the system under study (quantitative data), which are converted into fuzzy information by means of the fuzzification function. These qualitative data, which capture the system's input/output behavior, are then used to identify the best qualitative model representing the system. This model is called, in FIR terminology, the optimal mask (Cellier et al., 96). The optimal mask specifies which of the available potential input variables contain the most information about the output variable to be predicted. Once the best model has been identified, it can be used together with the fuzzified data to obtain a state transition table, i.e., a qualitative rule base that can then be used by the inference engine to make qualitative predictions of the output. The FIR inference engine is a special implementation of the k-nearest neighbor rule that is commonly used in pattern recognition. Finally, the qualitative prediction is converted back to a quantitative value by means of the defuzzification function.

This paper focuses on the qualitative model identification process. The model that best describes the system under study is obtained by computing a quality measure (a metric) for all possible masks and then selecting the mask with the highest quality. Consequently, an exhaustive search of all possible models is performed by an algorithm of exponential complexity, which becomes intractable for complex systems. In real applications, the search space therefore usually needs to be reduced in order to obtain a model within reasonable computational time.

A reduction of the search space can be accomplished by forbidding particular causal relations between the variables (m-inputs) that take part in the model. This approach is justified when previous knowledge of the system under study is available that makes it possible to determine a set of relations that either are not physically meaningful or make no sense logically. However, the technique becomes problematic when the system under study is a complex system for which no information is available besides the measured data. In such a situation, reducing the search space by arbitrarily forbidding some of the possible relations between the variables (m-inputs) is dangerous, because this might eliminate the best qualitative model from the search space. An NP-complete problem cannot be expected to be solved reliably in polynomial time (unless P = NP). Any algorithm of polynomial complexity applied to such a problem bears the risk of missing the true optimum.
Therefore, several alternative suboptimal search algorithms of polynomial complexity should be devised that can be tried one after the other, in order to enhance the chance of finding the truly optimal solution of a realistically complex NP-complete identification problem in polynomial time.

The main goal of this paper is to assess classical search techniques for qualitative model identification in the fuzzy inductive reasoning methodology. In the next section, the search problem is presented in more detail by introducing the reader to the qualitative model identification algorithm that, in the FIR methodology, is called the optimal mask function.

THE SEARCH PROBLEM

As mentioned in the previous section, fuzzy inductive reasoning is a completely inductive approach to modeling. It operates on a set of measured data points and identifies a model from previously made observations. Quantitative values in FIR are represented as qualitative triples, consisting of class, membership, and side values for each quantitative element (Cellier et al., 96). The class and membership values carry the same semantics as in all fuzzification schemes. The side value is a specialty of the FIR methodology. It provides qualitative information on whether the quantitative value to be fuzzified lies to the left or to the right of the maximum of the membership function that is associated with the selected class. By storing the side value as a third piece of information, the fuzzification process preserves the complete information about the original quantitative data point.

An inductive reasoning model is a mask that relates the input variables and previous values of the output variables to the current values of the outputs. An example of a mask (model) is presented below:

$$
\begin{array}{c|cccc}
t \backslash x & u_1 & u_2 & u_3 & y_1 \\ \hline
t - 2\delta t & -1 & -1 & 0 & 0 \\
t - \delta t & 0 & 0 & -1 & -1 \\
t & -1 & 0 & 0 & +1
\end{array}
\tag{1}
$$

It represents the equation $y_1(t) = \tilde{f}(u_1(t - 2\delta t), u_2(t - 2\delta t), u_3(t - \delta t), y_1(t - \delta t), u_1(t))$, where $\tilde{f}$ denotes a qualitative relationship. Each negative element in the mask is called an m-input, and the positive element is the m-output. Each m-input denotes a causal relation with the output and influences it to a certain degree. The number of non-zero elements is called the complexity of the mask. In the previous example, the complexity of the mask is 6. The mask is used together with the fuzzified data measured from the system to obtain a qualitative rule base, from which the mask can be evaluated by computing its quality, as explained below.

The FIR optimal mask function works as follows. It searches through all legal masks of complexity two (one m-input and the m-output) and finds the best one by computing the quality of each mask; it then proceeds by searching through all legal masks of complexity three (two m-inputs and the m-output) and finds the best of those; and it continues in the same manner until the maximum complexity is reached. The mask with the absolute best quality is then chosen as the optimal one. The search has exponential complexity, evaluating $2^{n-1} - 1$ masks before reaching a decision, where $n$ is the maximum possible complexity. In the previous example (equation 1), $n$ is 12 (4 variables $\times$ 3 time steps), assuming that the maximum allowed depth of the mask is chosen to be 3.

In the FIR methodology, the quality of a mask, $Q_m$, is defined as the product of its uncertainty reduction measure, $H_r$, and its observation ratio, $O_r$:

$$Q_m = H_r \cdot O_r \tag{2}$$
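As an illustration only, the level-by-level enumeration described above can be sketched in Python. The function name `exhaustive_search`, the representation of a mask as a set of candidate positions, and the placeholder score `toy_quality` are assumptions made for this sketch; they are not part of the FIR implementation, and the real quality measure would be plugged in through the `quality` argument.

```python
from itertools import combinations


def exhaustive_search(n_candidates, quality, max_inputs=None):
    """Evaluate every non-empty set of m-inputs, level by level, and keep the best.

    n_candidates: number of mask positions that may become m-inputs (n - 1).
    quality: hypothetical callable mapping a frozenset of positions to a score.
    """
    if max_inputs is None:
        max_inputs = n_candidates
    best_mask, best_q, explored = None, float("-inf"), 0
    for size in range(1, max_inputs + 1):  # mask complexity = size + 1
        for subset in combinations(range(n_candidates), size):
            mask = frozenset(subset)
            q = quality(mask)
            explored += 1
            if q > best_q:
                best_mask, best_q = mask, q
    return best_mask, best_q, explored


def toy_quality(mask):
    # Placeholder score, NOT the FIR quality measure: rewards positions 0 and 3
    # and penalizes every additional m-input.
    return 0.5 * (0 in mask) + 0.3 * (3 in mask) - 0.05 * len(mask)


# n = 12 mask positions: one m-output plus 11 candidate m-inputs.
best_mask, best_q, explored = exhaustive_search(11, toy_quality)
print(sorted(best_mask), explored)
```

With 11 candidate positions the loop visits $2^{11} - 1$ masks, which is the exponential count stated in the text.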

The normalized entropy reduction $H_r$ is defined as:

$$H_r = 1.0 - \frac{H_m}{H_{max}} \tag{3}$$

where $H_m$ is the Shannon entropy of the mask, and $H_{max}$ is the maximum entropy, reached when, for each combination of input states, each possible output state is equally likely. $H_r$ is a real number in the range 0.0 to 1.0, where higher values usually indicate an improved forecasting power. The masks with the highest entropy reduction values generate forecasts with the smallest degree of uncertainty. The Shannon entropy of the mask, $H_m$, is computed as the sum:

$$H_m = - \sum_{\forall i} p(i) \cdot H_i \tag{4}$$

where $H_i$ is the Shannon entropy measure relative to a given input state, and $p(i)$ is the probability of that input state occurring. The observation ratio, $O_r$, measures the number of observations for each input state. If every m-input state has been observed at least five times, $O_r$ is equal to 1.0. If no m-input state has been observed at all (no data are available), $O_r$ is equal to 0.0. $O_r$ is a measure of the mask complexity.

The exhaustive search performed in the FIR methodology has the advantage of ensuring that the optimal mask is always found. However, its main drawback is its exponential computing time. In the next section, other approaches borrowed from classical search paradigms are proposed in the hope of reducing the computing time while preserving a high probability of finding either the optimal mask or, at least, a sub-optimal mask with a quality value close to that of the optimal mask.
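The quality measure of equations (2) to (4) can be illustrated with a small Python sketch. Several details are assumptions rather than statements of the FIR implementation: the function name `mask_quality`, the input format (a list of input-state/output-class pairs already extracted with a mask), the use of base-2 logarithms, and the linear interpolation of $O_r$ between the two boundary cases given in the text. The sign of equation (4) is folded into each per-state entropy so that every term is nonnegative.

```python
from collections import Counter, defaultdict
from math import log2


def mask_quality(records, n_input_classes, n_output_classes, min_obs=5):
    """Return (H_r, O_r, Q_m) for a list of (input_state, output_class) pairs.

    input_state: tuple of class indices, one per m-input.
    n_input_classes: number of classes of each m-input (defines the legal states).
    n_output_classes: number of output classes, assumed to be at least 2.
    """
    if not records:
        return 0.0, 0.0, 0.0
    outputs_by_state = defaultdict(Counter)
    for state, out in records:
        outputs_by_state[state][out] += 1
    total = len(records)

    # H_m: per-state output entropy, weighted by the probability of the state.
    h_m = 0.0
    for outs in outputs_by_state.values():
        n_state = sum(outs.values())
        h_i = -sum((c / n_state) * log2(c / n_state) for c in outs.values())
        h_m += (n_state / total) * h_i
    h_max = log2(n_output_classes)  # all output classes equally likely
    h_r = 1.0 - h_m / h_max

    # O_r (assumed form): each legal state contributes min(count, 5) / 5,
    # so O_r is 1.0 when every state was seen five times and 0.0 with no data.
    n_legal = 1
    for n_classes in n_input_classes:
        n_legal *= n_classes
    covered = sum(min(sum(outs.values()), min_obs)
                  for outs in outputs_by_state.values())
    o_r = covered / (min_obs * n_legal)
    return h_r, o_r, h_r * o_r


# One binary m-input, two output classes, each input state observed five times.
deterministic = [((0,), 0)] * 5 + [((1,), 1)] * 5
print(mask_quality(deterministic, n_input_classes=[2], n_output_classes=2))
```

A fully deterministic, well-observed relation gives the maximum quality, while an input state whose outputs are evenly split drives $H_r$, and hence $Q_m$, to zero.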

CLASSICAL SEARCH TECHNIQUES

Most search problems can be posed as either constraint-satisfaction or optimization tasks. Often, the difference in computational complexity between an optimization task and its constraint-satisfaction counterpart is quite substantial. In such cases, it is necessary to relax the optimality requirement in order to find a "good" solution within a reasonable amount of search effort. Whenever the acceptance criterion tolerates a neighborhood about the optimal solution, a semi-optimization problem ensues. The FIR optimal mask search belongs to this group of problems, requiring a reasonable balance between the quality of the solution found (in the present case, the quality of the mask) and the cost of searching for it. Therefore, when heuristic search procedures are applied to the problem of finding a good model by means of the FIR methodology, usually a sub-optimal mask is found instead of the absolute best mask.

Is it justified to content oneself with a sub-optimal mask? In order to answer this question, it is necessary to pose another question first: what does "justified" mean in this context? The aim of obtaining a model of any kind of system is to be able to simulate it afterwards, that is, to predict its future behavior. Therefore, a sub-optimal mask in the context of the FIR methodology is justifiable if and only if it lends itself to making decent predictions about future system behavior. There are several reasons for answering the previous question affirmatively:

• There usually exist a number of sub-optimal masks with qualities that differ only slightly from the quality of the optimal mask. It often happens that, even though the optimal mask has been found using exhaustive search, a sub-optimal mask is finally selected by the researchers as superior due to its simplicity (reduced mask complexity in comparison with the optimal mask), when the qualities of both masks are very close to one another.

• The mask is simply a representation of a pattern of variables used to analyze the relations among the available variables. As mentioned earlier, FIR identifies a model from previously made observations, and therefore masks are nothing but possible explanations for patterns found in the data available to obtain the model. This model is then used to predict the future behavior of the system when excited with input signals that may never have shown up before, and therefore it cannot be guaranteed that the best prediction will be obtained when the optimal mask is used. It can indeed happen, and this is not even uncommon, that a sub-optimal mask predicts the future system behavior more accurately, because the training data were not truly representative of all the operational modes under which the system can be operated.

Therefore, the use of heuristic search techniques that help to identify a good mask (be it optimal or sub-optimal) while significantly reducing the computation time is strongly justifiable in optimal mask search problems in the context of the FIR methodology (Van Welden and Vansteenkiste, 96). Several heuristic search strategies have been reported in the literature that make it possible to explore different alternatives (Winston, 84; Pearl, 84). Some of them have been selected and implemented in order to study their performance and to assess their validity for the task at hand. The strategies studied are the following:

BRITISH MUSEUM WITH PRUNING

The exhaustive search technique used until now in the FIR methodology corresponds to a classical British Museum strategy, where all possible masks are investigated to select the best among them. In the British Museum with Pruning strategy, the classical method is accelerated by including some heuristics. From the quality measure presented in the previous section, it is intuitively clear that the entropy of the masks decreases with increasing mask complexity, i.e., when one moves down the tree. Therefore, if the observation ratio, $O_r$, is not taken into account, the quality of the mask, $Q_m$, will grow with increasing mask complexity. The observation ratio is a measure of mask complexity that penalizes those situations where, although the chosen mask is highly deterministic, it offers poor predictiveness because every state that has ever been observed has been seen only once, and there exist many legal states that have never been encountered at all in the set of training data.

Taking this knowledge into account, some heuristic information can be included in the search strategy. The graph is explored until all children and grandchildren of the current node have a lower quality than that of their parent and grandparent respectively, or until the complexity limit is reached, whichever comes first. The introduction of this termination criterion reduces the number of explored masks, and thereby the computational time, while not significantly reducing the probability of including the optimal mask in the search tree.
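The termination criterion above leaves room for interpretation. The following Python sketch implements one possible reading, which is an assumption of this example: a branch is abandoned as soon as a mask is worse than its parent and that parent was already worse than its own parent. The names `pruned_exhaustive_search` and `toy_quality` are hypothetical, and the score is a placeholder rather than the FIR quality measure.

```python
def pruned_exhaustive_search(n_candidates, quality, max_inputs):
    """Exhaustive search that stops descending after two successive quality drops."""
    best_mask, best_q = None, float("-inf")
    seen = set()
    # Each stack entry: (mask, quality of the mask, quality of its parent).
    stack = [(frozenset(), None, None)]
    while stack:
        mask, q_mask, q_parent = stack.pop()
        if len(mask) >= max_inputs:  # complexity limit reached
            continue
        for i in range(n_candidates):
            child = mask | {i}
            if i in mask or child in seen:
                continue
            seen.add(child)
            q_child = quality(child)
            if q_child > best_q:
                best_mask, best_q = child, q_child
            # Assumed pruning rule: child < parent and parent < grandparent.
            two_drops = q_parent is not None and q_child < q_mask < q_parent
            if not two_drops:
                stack.append((child, q_child, q_mask))
    return best_mask, best_q, len(seen)


def toy_quality(mask):
    # Placeholder score: three informative positions, a penalty per m-input.
    gain = {0: 0.5, 3: 0.3, 7: 0.2}
    return sum(gain.get(i, 0.0) for i in mask) - 0.05 * len(mask)


best_mask, best_q, explored = pruned_exhaustive_search(8, toy_quality, 8)
print(sorted(best_mask), explored, "of", 2 ** 8 - 1)
```

On this toy score the pruned search still reaches the best mask while skipping subtrees made up only of uninformative positions; on real data the saving depends on how quickly the observation ratio starts to penalize complex masks.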

HILL-CLIMBING

The Hill-Climbing strategy is based on the concept of local optimization. It is the simplest and most popular search strategy (Van Welden and Vansteenkiste, 96). In terms of graph-search methods, this strategy amounts to repeatedly expanding a mask, inspecting its newly generated successors, and choosing and expanding only the best among these successors, while retaining no further reference to either the parent or the siblings. The search terminates when no further expansion can be done, that is, when the complexity limit is reached. The major advantage of this strategy is that its computational effort is polynomial and considerably lower than that of any other heuristic search technique. The disadvantage of Hill-Climbing is that one can easily get stuck in a local maximum, thereby ending up with a mask whose quality is still far from that of the optimal mask.
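A minimal Python sketch of the Hill-Climbing strategy as described above follows. The function name `hill_climbing`, the set-based mask representation, and the placeholder score `toy_quality` are assumptions of this example. The toy score deliberately contains an interaction (positions 1 and 2 are only valuable together) to show how the greedy choice can miss a better mask.

```python
def hill_climbing(n_candidates, quality, max_inputs):
    """Greedy search: at each level keep only the best successor of the current mask."""
    current = frozenset()
    best_mask, best_q, explored = None, float("-inf"), 0
    for _ in range(max_inputs):
        successors = [current | {i} for i in range(n_candidates)
                      if i not in current]
        if not successors:
            break
        scored = [(quality(m), m) for m in successors]
        explored += len(scored)
        level_q, level_mask = max(scored, key=lambda pair: pair[0])
        if level_q > best_q:
            best_mask, best_q = level_mask, level_q
        current = level_mask  # parent and siblings are discarded
    return best_mask, best_q, explored


def toy_quality(mask):
    # Placeholder score, not the FIR quality: positions 1 and 2 together earn a bonus.
    q = 0.30 * (0 in mask) + 0.28 * (1 in mask) - 0.05 * len(mask)
    if {1, 2} <= mask:
        q += 0.5
    return q


best_mask, best_q, explored = hill_climbing(4, toy_quality, max_inputs=2)
print(sorted(best_mask), explored)
```

Here the search commits to position 0 at the first level and therefore never evaluates the mask {1, 2}, which scores higher under this toy function: a small instance of the local-maximum problem mentioned above.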

BEAM SEARCH

Beam searching can be viewed as an enhancement of the Hill-Climbing strategy. In this strategy, the k best among the successors are investigated, where k is any number between 1 (Hill-Climbing) and n (British Museum). Beam searching amounts to repeatedly expanding the best k masks of one level, inspecting their newly generated successors, and choosing and expanding the k best among them. As long as k is kept constant, the search strategy preserves polynomial complexity. Only if k is chosen as a fraction of n does the method become exponential in complexity. In the dialect of Beam searching implemented in FIR, k is a user-modifiable parameter, allowing the user to select the width of the search tree. Like the Hill-Climbing strategy, Beam search terminates when the complexity limit is reached.

This method can substantially reduce the search space in comparison with the British Museum strategy. If k is chosen properly, Beam searching offers the best of both worlds: a dramatically reduced search space and yet a high probability of finding either the optimal mask or at least a sub-optimal mask of comparable quality. The main disadvantage of Beam searching is that the user needs to select k, a parameter that is related to the search technology rather than to the physics of the modeling problem. Usually, the user has no insight that would allow him or her to estimate the best possible value of this parameter.
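The Beam search strategy can be sketched in Python along the same lines. The function name `beam_search`, the merging of duplicate successors through a set, and the placeholder score `toy_quality` are assumptions of this example and do not describe the dialect implemented in FIR.

```python
def beam_search(n_candidates, quality, max_inputs, k):
    """Keep the k best masks of each level and expand only those."""
    beam = [frozenset()]
    best_mask, best_q, explored = None, float("-inf"), 0
    for _ in range(max_inputs):
        # Successors of all masks in the beam; duplicates are merged by the set.
        successors = {m | {i} for m in beam
                      for i in range(n_candidates) if i not in m}
        if not successors:
            break
        scored = sorted(((quality(m), m) for m in successors),
                        key=lambda pair: pair[0], reverse=True)
        explored += len(scored)
        if scored[0][0] > best_q:
            best_q, best_mask = scored[0]
        beam = [m for _, m in scored[:k]]
    return best_mask, best_q, explored


def toy_quality(mask):
    # Placeholder score, not the FIR quality: positions 1 and 2 together earn a bonus.
    q = 0.30 * (0 in mask) + 0.28 * (1 in mask) - 0.05 * len(mask)
    if {1, 2} <= mask:
        q += 0.5
    return q


for k in (1, 2):
    mask, q, explored = beam_search(4, toy_quality, max_inputs=2, k=k)
    print(k, sorted(mask), explored)
```

With k = 1 the sketch behaves like Hill-Climbing and settles on {0, 1}; with k = 2 the second-best first-level mask is also expanded, so {1, 2} is found at the cost of two additional evaluations.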

BRANCH AND BOUND

The Branch and Bound strategy has the main advantage of preserving incompletely explored levels for further consideration, thereby reducing the risk of getting stuck in a local maximum. This strategy expands the best mask by one level. The new mask is then inspected along with the remaining old ones, and again the best mask is chosen and expanded. The search terminates when the complexity limit is reached.

The British Museum and Hill-Climbing search strategies represent two extremes with respect to their computational effort as well as their ability to find the optimal mask. While the British Museum strategy always finds the best mask, its search space is of exponential complexity:

$$\binom{n-1}{1} + \binom{n-1}{2} + \cdots + \binom{n-1}{n-1} = 2^{n-1} - 1 \tag{5}$$

In contrast, Hill-Climbing is the strategy that requires the least computational time, due to the small number of masks investigated with this technique:

$$\binom{n-1}{1} + \binom{n-2}{1} + \cdots + \binom{2}{1} = \frac{n(n-1)}{2} - 1 \tag{6}$$

where $n$ is the maximum mask complexity. Clearly, the British Museum strategy is of exponential complexity, whereas Hill-Climbing is of polynomial complexity. The remaining search techniques presented above lie between these two extremes. In the next section, results are presented that make it possible to assess the previously listed search techniques with respect to both computational cost and the quality of the best mask found.
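The Branch and Bound strategy described in this section can be read as a best-first expansion over a list of open masks. The Python sketch below follows that reading, which is an assumption of this example: the best open mask is expanded, all its successors join the open list, and the search stops when the mask selected for expansion has reached the complexity limit. The names `branch_and_bound` and `toy_quality` are hypothetical, and the score is a placeholder rather than the FIR quality measure.

```python
import heapq
from itertools import count


def branch_and_bound(n_candidates, quality, max_inputs):
    """Best-first search that keeps unexpanded masks of earlier levels available."""
    order = count()  # tie-breaker so the heap never has to compare masks
    open_list, seen = [], set()
    best_mask, best_q = None, float("-inf")
    current = frozenset()
    while True:
        for i in range(n_candidates):
            child = current | {i}
            if i in current or child in seen:
                continue
            seen.add(child)
            q = quality(child)
            if q > best_q:
                best_mask, best_q = child, q
            heapq.heappush(open_list, (-q, next(order), child))  # max-heap via -q
        if not open_list:
            break
        _, _, current = heapq.heappop(open_list)  # best mask among all open ones
        if len(current) >= max_inputs:  # complexity limit reached
            break
    return best_mask, best_q, len(seen)


def toy_quality(mask):
    # Placeholder score: position 0 looks best alone but combines poorly,
    # while positions 1 and 2 together earn a bonus.
    q = 0.30 * (0 in mask) + 0.28 * (1 in mask) - 0.05 * len(mask)
    if {1, 2} <= mask:
        q += 0.5
    if {0, 1} <= mask:
        q -= 0.3
    return q


best_mask, best_q, explored = branch_and_bound(4, toy_quality, max_inputs=2)
print(sorted(best_mask), explored)
```

In this toy run the successors of position 0 turn out worse than the still-open first-level mask {1}, so the search returns to that level and reaches {1, 2}, which a purely greedy descent from position 0 would not evaluate.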

RESULTS

Different applications from the soft sciences have been used to study the performance of the previously introduced search techniques. The example presented in this paper is from the area of anesthesiology. The main goal is to obtain a control model for the amount of anesthetic agent to be administered to a patient during surgery. The system is composed of three input variables (systolic arterial pressure, heart rate, and respiration rate) and one output variable, the amount of anesthetic agent to be delivered to the patient (Nebot et al., 96). Due to the difference between the slowest and fastest time constants of importance, a mask of depth 21 is chosen. Consequently, the maximum possible mask complexity is 84 (4 variables $\times$ 21 time steps). A search space of $2^{83} - 1$ masks is obviously intractable, and therefore it was necessary to reduce the maximum complexity of the mask. This has been accomplished by introducing forbidden temporal relations among the variables, without reducing the mask depth, so that the mask can still capture the range of important time constants.

Two different tests were performed, reducing the maximum mask complexity by introducing different forbidden connections. In the first experiment, a mask with a maximum of 12 non-zero elements was considered, so that the complete search space contains $2^{11} - 1 = 2047$ masks. In the second experiment, a maximum of 16 non-zero entries in the mask was tolerated, which corresponds to a search space of $2^{15} - 1 = 32767$ masks. The results of these tests are presented in table 1, which lists several pieces of information related to the search: CPU time (measured in seconds), number of masks explored, best mask found, and the quality of the best mask found.

                             n = 12                                       n = 16
Technique       Time (s)  Num.Masks  Best Mask  Quality     Time (s)  Num.Masks  Best Mask  Quality
Brit.Mus.          153       2047    opt.       0.5968        7395      32767    opt.       0.5968
Brit.Mus.Pru.       52        459    opt.       0.5968         231       1373    opt.       0.5968
Hill-Clim.           2         38    subopt.    0.5470           2         54    subopt.    0.5243
Beam                54        684    subopt.    0.5470         651       5508    subopt.    0.5407
Branch-Bound         6         71    subopt.    0.5237          13        125    subopt.    0.4141

Table 1: Results of the anesthesiology application with maximum mask complexity of 12 and 16

As can be seen in table 1, although British Museum always finds the optimal mask, its computational time makes it poorly suited to complex systems. On the other hand, the Hill-Climbing and Branch and Bound techniques are very fast but usually find a sub-optimal solution whose quality is still far from that of the optimal mask. British Museum with Pruning and Beam search are intermediate strategies. These techniques usually find the optimal mask or a sub-optimal mask very close to the optimal one with respect to the quality parameter. The time needed by these techniques is still far below that of the exhaustive search, although it is higher than the computational time that the Hill-Climbing and Branch and Bound strategies usually need.

Several Beam search tests were performed for different values of the k parameter. The results shown in table 1 were obtained with k = 2. Beam search with k equal to 3 and 5 was able to find the optimal mask, with computational times of 123 and 160 seconds respectively. Other applications from the soft sciences can be found in (Jerez and Nebot, 1997), also presented at the EUFIT'97 conference.

CONCLUSIONS

Classical search techniques appear to be well suited for the identification of qualitative models in the FIR methodology. All of the search techniques implemented in this research perform better than the exhaustive (British Museum) search that has traditionally been used in the FIR methodology. The examples reported in this paper demonstrate that these search strategies find the optimal or a sub-optimal mask with considerably less computational time and memory. However, each of them has its own drawbacks. Whereas Hill-Climbing is the best among these techniques with respect to search time, it cannot be expected that this method will always find the optimal mask or even one close to it in quality. In contrast, it is usually much more likely that the optimal mask or a close contender is found using the Beam search or Branch and Bound strategies. However, their efficiency is much reduced in comparison with the Hill-Climbing technique. Yet even their computational time is still much more favorable than that of the classical British Museum strategy. The British Museum with Pruning strategy turned out to be a surprisingly good alternative to both Beam search and Branch and Bound.

REFERENCES

Cellier, F.E., Nebot, A., Mugica, F. and de Albornoz, A. 1996. Combined Qualitative/Quantitative Simulation Models of Continuous-Time Processes Using Fuzzy Inductive Reasoning Techniques. International Journal of General Systems, Vol. 24, Num. 1-2, pp. 95-116.

Jerez, A. and Nebot, A. 1997. Genetic Algorithms vs. Classical Search Techniques for Identification of Fuzzy Models. EUFIT'97, 5th European Congress on Intelligent Techniques and Soft Computing, Aachen, Germany.

Klir, George J. 1985. Architecture of Systems Problem Solving. Plenum Press, New York.

Nebot, A., Cellier, F.E. and Linkens, D.A. 1996. Synthesis of an Anaesthetic Agent Administration System Using Fuzzy Inductive Reasoning. Artificial Intelligence in Medicine, Vol. 8.

Pearl, Judea. 1984. Heuristics: Intelligent Search Strategies for Computer Problem Solving. Addison-Wesley, Massachusetts.

Van Welden, D. and Vansteenkiste, G.C. 1996. Sub-optimal Mask Search in SAPS. International Journal of General Systems, Vol. 24, Num. 1-2, pp. 137-150.

Winston, Patrick Henry. 1984. Artificial Intelligence. Addison-Wesley Publishing Company, Menlo Park, California.
