On-line modelling based on Genetic Programming

Stephan Winkler, Hajrudin Efendic, Luigi Del Re
Institute for Design and Control of Mechatronical Systems, Johannes Kepler University Linz, Austria

Michael Affenzeller, Stefan Wagner
Institute for Formal Models and Verification, Johannes Kepler University Linz, Austria

Abstract: Genetic Programming, a heuristic optimization technique based on the theory of Genetic Algorithms, is a method successfully used to identify nonlinear model structures by analyzing a system’s measured signals. Mostly, it is used as an offline tool which means that structural analysis is done after collecting all available identification data. In this paper, we propose an enhanced on-line GP approach that is able to adapt its behaviour to new observations while the GP process is executed. Furthermore, an approach using GP for on-line fault diagnosis is described, and finally test results using measurement data of NOx emissions of a BMW diesel engine are discussed.

Keywords: Genetic Programming, Data Driven Model Identification, Self-Adaption, Machine Learning, On-Line Modelling, Fault Isolation.

Reference to this paper should be made as follows: Winkler, S., Efendic, H., Affenzeller, M., Del Re, L. and Wagner, S. (2004) ‘On-line modelling based on Genetic Programming’, XXXX, Vol. x, No. x, pp. xxx-xxx.

Biographical notes: Stephan Winkler received his MSc from Johannes Kepler University, Linz, Austria. He is now a Research Associate at the Institute of Design and Control of Mechatronical Systems at Johannes Kepler University, Linz, Austria. His research interests include Genetic Programming, Nonlinear Model Identification and Machine Learning.

Hajrudin Efendic received his MSc from the University of Sarajevo and is also a Research Associate at the Institute of Design and Control of Mechatronical Systems at Johannes Kepler University. He is interested in Control Oriented Design and Fault Diagnosis.

Luigi Del Re is Professor of the Institute of Design and Control of Mechatronical Systems at Johannes Kepler University, Linz, Austria. His main research interests are Nonlinear Control, Parameter and Structure Identification, Automotive Control, Model Predictive Control and Systematic Design Assessment.

Michael Affenzeller has published several papers and journal articles dealing with theoretical aspects of Genetic Algorithms and Evolutionary Computation in general. He is a member of the Institute of Formal Models and Verification at Johannes Kepler University, Linz, Austria, where he currently holds the position of an Associate Professor.

Stefan Wagner received his MSc from Johannes Kepler University, Linz, Austria and is now a Research Associate at the Institute of Formal Models and Verification. His research interests include Heuristic Optimisation, Machine Learning and Software Development.

1 INTRODUCTION

The problem of finding a model for a system’s measured signal, i.e. discovering the mathematical relationship between empirically observed variables measuring a system, is an important problem in technical fields such as mechatronics, but also in economics and other areas of science [1]. In practice, the observed data may be noisy and there may be no known way to express the relationships involved in a precise way. Problems of this type are of scientific interest in the context of systems control and data mining; usually they are called symbolic system identification problems, black box problems or modelling problems.

In this paper we present an on-line structure identification method based on a Genetic Programming identification approach. This identification algorithm is able to adapt its behaviour during its execution to new data, i.e. to a change in the algorithm’s environment. By doing so, this method combines the advantages of enhanced, hybrid variants of Genetic Algorithms and heuristic optimization techniques with real-time knowledge discovery and data mining.

Evolutionary computing is the collective name for heuristic problem solving techniques based on the principles of biological evolution, which are natural selection and genetic inheritance. One of the greatest advantages of these techniques is that they can be applied to a variety of problems, ranging from leading-edge scientific research to practical applications in industry and commerce; by now, evolutionary algorithms are in use in various disciplines like optimization, artificial intelligence, machine learning, simulation of economic processes, computer games or even sociology.

The forms of evolutionary computation relevant for the work described in this paper are Genetic Algorithms (GA) and Genetic Programming (GP). The fundamental principles of GAs were first presented by Holland [2]; overviews of GAs and their implementation in various fields were given for instance by Goldberg [3], Michalewicz [4] and Affenzeller [5]. A GA works with a set of solution candidates (also known as individuals) called population. During the execution of the algorithm each individual has to be evaluated, which means that a value indicating the quality is returned by a fitness function. New individuals are created on the one hand by combining the genetic make-up of two solution candidates (this procedure is called “crossover”), producing a new “child” out of two “parents”, and on the other hand by mutating some individuals, which means that randomly chosen parts of genetic information are changed (normally a minor ratio of the algorithm’s population is mutated in each generation).

Besides crossover and mutation, the third decisive aspect of Genetic Algorithms is selection. In analogy to biology this is a mechanism also called “survival of the fittest”. Each individual is associated with a fitness value, and an individual’s probability of propagating its genetic information to the next generation is proportional to its fitness: The better a solution candidate’s fitness value, the higher the probability that its genetic information will be included in the next generation’s population. This procedure of crossover, mutation and selection is repeated over many generations until some termination criterion is fulfilled.
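The fitness-proportional selection described above can be illustrated with a short sketch. It assumes that fitness values are non-negative and that larger values are better; the function name `roulette_select` is chosen here for illustration and does not come from the paper.

```python
import random


def roulette_select(population, fitnesses, rng=random):
    """Pick one individual with probability proportional to its fitness.

    Assumption: all fitness values are non-negative and at least one is positive.
    """
    total = sum(fitnesses)
    if total <= 0:
        raise ValueError("at least one fitness value must be positive")
    threshold = rng.random() * total  # uniform in [0, total)
    running = 0.0
    for individual, fitness in zip(population, fitnesses):
        running += fitness
        if running > threshold:
            return individual
    return population[-1]  # guard against floating-point round-off
```

An individual with three times the fitness of another is selected about three times as often; individuals with zero fitness are never selected.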

The basic idea of Genetic Programming, which was first explored in depth by Koza [7], is that virtually all problems in artificial intelligence, machine learning, adaptive systems, and automated learning can be recast as a search for a computer program, and that GP provides a way to successfully conduct the search for a computer program in the space of computer programs. Similar to GAs, GP works by imitating aspects of natural evolution to generate a solution that maximizes (or minimizes) some fitness function: A population of solution candidates evolves through many generations towards a solution using certain evolutionary operators and a “survival-of-the-fittest” selection scheme. The main difference is that, whereas GAs are intended to find an array of characters or integers representing the solution of a given problem, the goal of a GP process is to produce a computer program (or a formula) solving the optimization problem at hand. Figure 1 visualizes how the GP cycle works: As in every evolutionary process, new individuals are created. They are tested, and the fitter ones in the population succeed in creating children of their own. Unfit ones “die” and are removed from the population [6].

Figure 1: The GP cycle, taken from [6] (stages: Population of Programs; Test Programs; Select Parents in Proportion to their Fitness; Create new Programs)

2 GP BASED STRUCTURE IDENTIFICATION

Preliminary work for the approach presented in this paper was done for the project “Specification, Design and Implementation of a Genetic Programming Approach for Identifying Nonlinear Models of Mechatronic Systems” in the context of a strategic project at the Johannes Kepler University Linz, Austria. The goal of this project was to find models for mechatronic systems. It was successfully shown (for instance in [8] and in further detail in [9]) that methods of GP are suitable for determining an appropriate mathematical representation of a physical system. Furthermore, in [10] we have documented that this approach can also be used for solving classification problems.

We have used the methods implemented for this project for developing a GP-based on-line structure identification algorithm. This algorithm operates on a set of training data with measured signals (X1,…,XN). One of these signals (Xt) has to represent the system’s signal for which a model has to be found. On the basis of the training data, the algorithm tries to evolve (or, as one could also say, to “learn”) a solution, i.e. a formula, that represents the function which models the chosen target channel. In other words, each presented instance of the structure identification problem is interpreted as an instance of an optimization problem; a solution is found by a heuristic optimization algorithm. The goal of this GP process is to produce an algebraic expression

$$\tilde{X}_t = f(X_1, \ldots, X_N) \qquad (1)$$

approximating Xt as well as possible (only on the basis of a database containing the measured results of the experiments to be analyzed). Thus, the GP algorithm works with solution candidates that are tree structure representations of symbolic expressions. When the evolutionary algorithm is executed, each individual of the population represents one structure tree¹.

¹ Details of the basic structure of these formula trees, their implementation and several considerations regarding the function library can be found in [8] and [9]. The selection of the library functions is an important part of any GP modelling process, since this library should be able to represent a wide range of systems [11].

In general, both crossover and mutation processes are applied to randomly chosen branches (in this context a branch is the part of a structure lying below a given point in the tree). Crossing two trees means randomly choosing a branch in each parent tree and replacing the chosen branch of the tree that will serve as the root of the new child (this tree is randomly chosen, too) with the chosen branch of the other tree. Mutation in the context of Genetic Algorithms means modifying a solution candidate randomly and so creating a new individual. In the case of identifying structures, mutation works by choosing a node and changing it: A function symbol could become another function symbol or be deleted, the value of a constant node could be manipulated or the index or the time-offset of a variable could be modified. This procedure is less likely to improve a specific structure, but it helps the optimization algorithm to reintroduce genetic diversity in order to re-stimulate genetic search. Examples of genetic operations on tree structures are shown in Figure 2.

For evaluating solution candidates, several functions are possible. For the on-line approach presented here we have decided to use the average squared error function since it is independent of the number of considered samples.
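To make the formula-tree representation and the average squared error concrete, the following is a minimal sketch. The nested-tuple encoding, the four-operator function set and the protected division are assumptions made for illustration; the actual structures and function library are described in [8] and [9].

```python
# A tree is a numeric constant, a variable ("x", index, lag) meaning X_index(t - lag),
# or an inner node (operator, left_subtree, right_subtree).
OPS = {
    "+": lambda a, b: a + b,
    "-": lambda a, b: a - b,
    "*": lambda a, b: a * b,
    # Protected division (an assumption, common in GP): avoid dividing by ~0.
    "/": lambda a, b: a / b if abs(b) > 1e-12 else 1.0,
}


def evaluate(tree, signals, t):
    """Evaluate a formula tree at time step t; signals[i][t] is X_i at time t."""
    if isinstance(tree, (int, float)):
        return float(tree)
    if tree[0] == "x":
        _, index, lag = tree
        return signals[index][t - lag]
    op, left, right = tree
    return OPS[op](evaluate(left, signals, t), evaluate(right, signals, t))


def max_lag(tree):
    """Largest time offset used anywhere in the tree."""
    if isinstance(tree, (int, float)):
        return 0
    if tree[0] == "x":
        return tree[2]
    return max(max_lag(tree[1]), max_lag(tree[2]))


def average_squared_error(tree, signals, target):
    """Mean of squared residuals; dividing by the sample count makes the
    value comparable across data sets of different size."""
    start = max_lag(tree)  # skip time steps where lagged inputs are unavailable
    if start >= len(target):
        raise ValueError("not enough samples for the largest time offset")
    errors = [(evaluate(tree, signals, t) - target[t]) ** 2
              for t in range(start, len(target))]
    return sum(errors) / len(errors)
```

For example, the tree `("+", ("*", 2.0, ("x", 0, 1)), ("x", 1, 0))` encodes 2·X0(t-1) + X1(t).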

Figure 2: Exemplary genetic operations on tree structures: the crossover of parent1 and parent2 yields child1; child2 and child3 are possible mutants of child1
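The crossover and mutation operations on tree structures can be sketched as follows. This is an illustrative sketch only: it assumes trees encoded as nested tuples (constants, variables `("x", index, lag)` and inner nodes `(operator, left, right)`), and all function names are hypothetical. The mutation shown covers changing a function symbol, a constant value or a time offset; deleting a function symbol or changing a variable index is omitted for brevity.

```python
import random


def subtree_paths(tree, path=()):
    """Return the paths (tuples of child positions) to every node of the tree."""
    paths = [path]
    if isinstance(tree, tuple) and tree[0] != "x":  # inner node: (op, left, right)
        for position in (1, 2):
            paths.extend(subtree_paths(tree[position], path + (position,)))
    return paths


def get_subtree(tree, path):
    for position in path:
        tree = tree[position]
    return tree


def replace_subtree(tree, path, new_subtree):
    """Return a copy of the tree with the branch at the given path replaced."""
    if not path:
        return new_subtree
    children = list(tree)
    children[path[0]] = replace_subtree(tree[path[0]], path[1:], new_subtree)
    return tuple(children)


def crossover(root_parent, donor_parent, rng=random):
    """Replace a random branch of root_parent by a random branch of donor_parent.

    The caller decides (e.g. randomly) which parent serves as the root.
    """
    target_path = rng.choice(subtree_paths(root_parent))
    donor_path = rng.choice(subtree_paths(donor_parent))
    return replace_subtree(root_parent, target_path, get_subtree(donor_parent, donor_path))


def mutate(tree, rng=random):
    """Change one randomly chosen node without altering the tree's shape."""
    path = rng.choice(subtree_paths(tree))
    node = get_subtree(tree, path)
    if isinstance(node, (int, float)):        # constant: perturb its value
        new_node = node + rng.gauss(0.0, 0.5)
    elif node[0] == "x":                      # variable: shift its time offset
        new_node = ("x", node[1], max(0, node[2] + rng.choice((-1, 1))))
    else:                                     # function symbol: draw a symbol at random
        new_node = (rng.choice("+-*/"),) + node[1:]
    return replace_subtree(tree, path, new_node)
```

Because trees are immutable tuples here, both operations return new individuals and leave the parents untouched.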

3 ON-LINE GP IDENTIFICATION

Since the GP process is executed periodically (generation by generation), an additional stage can be designed and implemented quite easily. As is graphically shown in Figure 3, we have added an additional phase to the standard GP cycle: Before the next generation of solution candidates is produced, possibly available new data are collected from a predefined data source (e.g., a file as in the case of our prototypical implementation). One of the major advantages of this approach is that the benefits of Evolutionary Computation (namely the combination of directed and undirected search strategies as well as the use of a certain amount of randomness) are combined with concepts of on-line knowledge discovery and data mining. As described in further detail in the following section, this modelling method can be used as an alternative to existing on-line modelling and identification methods that are for example used in industrial fault detection and identification programs.
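The additional data-fetching stage can be sketched as a generic loop. This is a minimal sketch under stated assumptions: the callables `fetch_new_data`, `fitness` and `breed` are hypothetical placeholders for the data source, the evaluation function (lower is better) and the production of the next generation; none of these names come from the paper's implementation.

```python
from typing import Callable, Iterable, List, Sequence, TypeVar

Individual = TypeVar("Individual")
Sample = TypeVar("Sample")


def online_gp(
    population: List[Individual],
    fetch_new_data: Callable[[], Iterable[Sample]],
    fitness: Callable[[Individual, Sequence[Sample]], float],
    breed: Callable[[List[Individual], List[float]], List[Individual]],
    generations: int,
    initial_data: Iterable[Sample] = (),
):
    """Run a GP-style loop that collects new data before each generation step."""
    data = list(initial_data)
    best = None
    for _ in range(generations):
        # Additional stage: fetch new data if available.
        data.extend(fetch_new_data())
        if not data:
            continue
        # All individuals of one generation are evaluated on the same data.
        scores = [fitness(individual, data) for individual in population]
        best = population[min(range(len(population)), key=scores.__getitem__)]
        population = breed(population, scores)
    return best, data
```

Data are only added between generation steps, so individuals that are compared with each other are always evaluated under the same conditions.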

Figure 3: Workflow of the on-line GP process (blocks: Knowledge about the Model; Experimental Design, Data Collection; GP Function Library; Fetch New Data if Available; GP Algorithm; GP Results; Expert Analysis, Validation)

With respect to the measured data, the algorithm is able to adapt its behaviour as new identification data become available: Since all individuals of a GP algorithm’s population have to be evaluated every generation, the corresponding data set can be modified after every generation step. This of course means a change of the algorithm’s environment and is likely to influence the GP process in several (maybe unforeseen) ways. But since structural identification assumes an underlying concept of the investigated system anyway, this change of environment is expected to have positive rather than negative effects.

Last, but surely not least, we take strong advantage of the fact that instead of using standard implementations of the Genetic Algorithm as underlying GP algorithm, a new generic evolutionary algorithm, the SASEGASA [12], is applied. This hybrid GA uses an enhanced selection model which is designed to directly control genetic drift within the population by advantageous self-adaptive selection pressure steering. Additionally, this new selection model makes it possible to detect and combat premature convergence, which is generally quite a critical issue in GAs.

A very essential question about the general performance of GAs and especially GP is whether or not good parents are able to produce children of comparable or even better fitness. In natural evolution, this is almost always true. For GAs, this property is not so easy to guarantee, and for GP it is a matter of principle that many crossover and mutation results are counterproductive solution candidates. In order to overcome this drawback, the basic idea of the new selection model, which we have called offspring selection (cf. Figure 4), is to consider not only the fitness of the parents in order to produce a child for the ongoing evolutionary process. Additionally, the fitness value of the newly produced child is compared with the fitness values of its own parents. Basically the child is accepted as a candidate for the further evolutionary process if and only if the reproduction operator was able to produce a child that could outperform the fitness of its own parents. For further details about the more specific parameters and functioning of offspring selection the interested reader is referred to [13].

This strategy guarantees that evolution is continued mainly with crossover results that were able to mix the properties of their parents in an advantageous way, which is a very essential aspect for the preservation of essential genetic information stored in many individuals (which might not be the fittest in the sense of individual fitness). As elaborate test series have shown [9, 10, 12], the results obtained for various different optimization problems using the SASEGASA were significantly better than those produced by standard GA implementations.
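The acceptance rule of offspring selection can be illustrated with a simplified sketch. It assumes an error measure where lower is better and accepts a child only if it beats both of its parents. The fill-up with the best parents when too few successful children are found is an assumption made here to keep the example short; the actual scheme and its parameters are described in [13].

```python
def offspring_selection(population, error, select_parents, reproduce, max_attempts=1000):
    """Build the next generation from children that outperform both parents.

    error(individual) -> float, lower is better.
    select_parents(population) -> (mother, father)
    reproduce(mother, father) -> child (crossover, optionally followed by mutation)
    """
    next_generation = []
    attempts = 0
    while len(next_generation) < len(population) and attempts < max_attempts:
        attempts += 1
        mother, father = select_parents(population)
        child = reproduce(mother, father)
        # Accept the child only if it is better than the better parent.
        if error(child) < min(error(mother), error(father)):
            next_generation.append(child)
    # Simplification (assumption): fill any remaining slots with the best parents.
    shortfall = len(population) - len(next_generation)
    next_generation.extend(sorted(population, key=error)[:shortfall])
    return next_generation, attempts
```

The number of attempts needed to fill a generation gives a rough indication of how hard it has become to produce successful children.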

Figure 4: Embedding the new selection principle into a GA or GP

There are in fact several projects and publications that also mention the use of Genetic Programming techniques for on-line structure identification. For example, GP based approaches have been designed for Gene Function Identification (as for example described by [15]) or Robot Control and Robot Vision (see for instance [16] or [17]). Still, those methods either use GP for off-line training and then adjust parameters during execution or are simply tuned to very specific problem situations. The approach presented here, in contrast, is not restricted to any specific problem situation but can be used for any kind of data driven on-line identification process, since a wide range of mathematical expressions can be represented and the framework used is very flexible and not tuned to any specific application.

Evolutionary techniques (especially GAs and GP) have often been and are still often considered not suitable for on-line identification: “For an off-line process, a Genetic Programming method could be utilized to ‘evolve’ the function that best represents the system dynamics. This is an attractive approach because the actual structure of the dynamic equations would be revealed (and the parameters optimized in the process). Unfortunately, evolutionary programming techniques are ill-suited for on-line learning.” [18]

As we demonstrate in Section 5, this widespread opinion has to be reconsidered since the proposed GP-based method is indeed able to evolve suitable models (at least for mechatronical systems) on-line. Nevertheless, the authors are aware of the fact that the proposed method cannot operate “real-time” in the sense of responding to stimuli within some small upper limit of response time (e.g., milli- or microseconds). Since the acquisition of new measured data can only be performed after completing a whole generation step of the GP process², a GP based method cannot respond to inputs within milli- or microseconds, but it can respond within seconds (depending on the time needed to compute a whole generation step).

² This is simply because individuals that have to be compared with each other always have to be evaluated on the basis of the same environmental conditions, i.e. the same target and input signal values.

4 ONLINE FAULT DIAGNOSIS USING GP

Online Fault Diagnosis (FD) in complex industrial processes using a Models-on-Demand approach is described here as a possible application scenario of GP based on-line modelling. Early fault diagnosis is an issue for many industrial or commercial applications, for varying reasons. In general, supervision would be useful for any plant; some industrial processes need early fault diagnosis for efficient preventive maintenance, others need fault diagnosis to increase the process efficiency or the quality of products. In some industrial branches, for example for some components in the automotive industry, fault diagnosis has become legally regulated [19, 20].

A complex industrial system usually comprises several hundred sensors and measurement devices. In such a system there is frequently a need for fast and precise online FD. An attempt to fulfill both demands, namely fast and accurate online fault diagnosis, represents a significant engineering problem. In order to meet both demands, a multilayer fault diagnosis system using the Model-on-Demand approach, in which each layer is tuned to satisfy either speed or precision demands, is proposed.

Figure 5: Multilayer Fault Diagnosis using MoD (blocks: Data Stream; Model on Demand; Fast Fault Detection; Precise Fault Detection; Fault Isolation)

Figure 5 schematically shows a fault diagnosis system with several levels of fault detection and fault isolation. The first (fast) fault detection block is designed to be simple and fast; its goal is to run continuously, to analyze all available data from the data stream and to minimize the number of missed detections, while accepting a certain number of over-detections. The second (precise) fault detection block is triggered by fault detection(s) from the first FD block. The fault detection statement from the first FD block is then refined in the second FD block, with the goals to “filter” as many over-detections from the first block as possible and then to focus on the measurement channels which are likely to contain a measurement or process fault. The fault detections which are still present after the second FD step trigger the fault isolation block, which produces the Fault Isolation (FI) statement.

In order to have graded fault detection, a Model-on-Demand approach for modelling for fault detection, which is parameterized using the characteristics of the measurement system, is proposed. The use of Model on Demand (MoD) for fault detection is focused on generating models with sufficient model quality. Furthermore, Model on Demand can also be used for sharpening the Fault Detection and Isolation (FDI) statement by generating orthogonal models.

An important step in fault diagnosis is the modelling of the process (or selected parts of the process) in order to use those models for FD. Three basic groups of models are commonly used: analytical models, expert knowledge and data-based models. Analytical models always have a theoretical background which makes them universally applicable. Knowledge based models represent collected expert knowledge about processes or parts of processes, and when coded in expert systems they are appropriate for use in a fault diagnosis system. Finally, data based models approximate processes or parts of processes using measured information related to them.

For numerous applications, the question of on-line fault diagnosis which involves an automatic on-line modelling procedure is of great importance. The concept of automatic modelling for failure detection, as proposed for example by Schrempf et al. [21] and Efendic et al. [22], can be executed using data based models. Figure 6 illustrates the proposed concept which consists of three steps: information sorting, parallelization and model evaluation.

Figure 6: Basic steps of data processing (Information Sorting, Parallelisation, Evaluation)

Most model-based fault diagnosis methods rely on the idea of analytical redundancy [23]. Analytical redundancy is applied in the way that sensor measurements are compared to analytically calculated values of the respective variables. The idea can be extended to the comparison of analytically generated quantities only, each one obtained through a different computation [24]. This artificial redundancy can be achieved through parallel modelling in two ways: by generating models of a target variable in parallel using different modelling methods, or by generating models of a target variable with different structures using a single modelling method. For the proposed parallel FD approach it is necessary to implement a so-called discriminator whose task is to categorize the modelling procedure and the statements of the preliminary plausibility analysis. Two discrimination scenarios are possible for the proposed fault diagnosis approach:

• Discrimination between different types of models before fault detection as the first step in a fault diagnosis process (as shown in Figure 7). This discrimination then has to be based on model qualities (i.e., quantitative criteria related to different models).

• Discrimination between different fault detection statements after fault detection (as illustrated in Figure 8). This discrimination has to include both qualities of models and fault detection statements from each of them.

Figure 7: Fault Isolation using Model Discrimination

The goal of the first type of discrimination does not necessarily have to be the identification of only one, “the best available” model, but rather to set a lower limit for the quality of models to be used in the fault diagnosis process. This means that the use of several models in parallel, when all of them comply with the predefined minimum design requests, is not to be avoided. Such an approach has several advantages, the most important being that it provides the artificial redundancy necessary for the process of triggered FI which follows the fault detection [25]. The second type of discrimination, however, has the goal to combine evaluation statements of preliminary plausibility check algorithms in order to improve detection and over-detection rates.
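The first discrimination scenario, in which every model that meets a minimum quality requirement is kept and used in parallel, can be sketched as follows. This is an illustrative sketch: the `Model` class, the use of the mean squared error as quality criterion and the fixed residual tolerance are assumptions made here, not details given in the paper.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence

Sample = Dict[str, float]  # one measurement: channel name -> value


@dataclass
class Model:
    name: str
    predict: Callable[[Sample], float]  # estimate of the target variable


def mean_squared_error(model: Model, samples: Sequence[Sample],
                       targets: Sequence[float]) -> float:
    errors = [(model.predict(s) - y) ** 2 for s, y in zip(samples, targets)]
    return sum(errors) / len(errors)


def discriminate_models(pool: Sequence[Model], samples: Sequence[Sample],
                        targets: Sequence[float], max_error: float) -> List[Model]:
    """Keep every model of the pool that satisfies the quality limit."""
    return [m for m in pool if mean_squared_error(m, samples, targets) <= max_error]


def fault_statements(models: Sequence[Model], sample: Sample,
                     measured: float, tolerance: float) -> Dict[str, bool]:
    """One fault statement per model: True if its residual exceeds the tolerance."""
    return {m.name: abs(measured - m.predict(sample)) > tolerance for m in models}
```

Keeping several acceptable models rather than a single best one yields several independent fault statements for the subsequent fault isolation step.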

Figure 8: Fault Isolation using Fault Detection Discrimination

The diversity of existing data based modelling methods (some of them being local and global correlation models, local and global regression models, static and dynamic fuzzy expert systems, and neural network models) allows a good modelling of processes independently of the point of view of linearity, dynamics, disturbances, etc. Although analytical models (and sometimes also expert knowledge models) are generally considered as more exact, data-based modelling methods have an important advantage: Their generation does not require any advance knowledge about the modelled process. But if different types of data based models are analyzed, several disadvantages can arise. For example, even though fuzzy and neural network models have the advantage of modelling both linear and nonlinear processes well, their important disadvantage is that in general they cannot be presented explicitly. In some industrial applications, however, the explicit representation of the models for fault diagnosis can be as important as the fault diagnosis itself since it can allow uncovering and identification of previously unknown relationships within the observed industrial process.

The GP modelling approach proposed in this paper combines the advantages of on-line model generation out of measurement data without using any (expert) knowledge about the modelled process with those of producing models which can be represented in an explicit form. In a parallel automatic modelling approach for fault diagnosis, GP can be used in addition to already existing modelling methods. Furthermore, due to the stochastic nature and the randomness of GP, independent modelling runs, even with the same inputs, can result in models with (at least partially) different structures and input signals. Thus, GP structure identification can be used for realizing fault diagnosis based on analytical redundancy; examples supporting this hypothesis are discussed in the following section.

5 EVALUATION AND EXPERIMENTAL RESULTS

Empirical studies with different problem classes and instances are the most effective way to analyze the potential of heuristic optimization searches like Evolutionary Algorithms. In our experiments, all computations were performed on a Pentium 4 PC with 1 GB RAM under Windows XP; the programs are written in the C# programming language using the Microsoft .NET framework 1.1.

As already mentioned in the previous sections, the basic GP structure identification method has been tested extensively, yielding surprisingly suitable models, for instance when investigating several mechatronical systems and regression problems [8, 9, 10]³. During all these test series (and also for testing the proposed on-line learning method), the HeuristicLab [14], a generic and extensible optimization framework developed by members of the Institute of Systems Theory at the University of Linz, Austria, was used as underlying basic framework.

³ The mentioned publications as well as screenshots illustrating recent structure identification results are available for download from the HeuristicLab website www.heuristiclab.com.

For testing the presented on-line learning GP algorithm we have analyzed the data representing several signals of a BMW M47D diesel engine (with activated exhaust recirculation). The goal was to identify a model for the engine’s NOx emissions using the measured values of several other engine parameters (such as temperatures, pressures or the position of the throttle control). Additionally, information about other emissions (mainly CO and CO) and the throttle control should not be incorporated in the model because of redundancies and relatively high costs of exhaust sensors. A whole FTP 75 cycle was performed within approximately 1,400 seconds; all sensor signals (in total 33) were recorded with 20 Hz resolution, and for the GP identification algorithm the data was downsampled to 5 Hz resolution. For simulating an on-line learning scenario, initially only 50 samples are inserted into the algorithm’s data pool, and one more is added every 0.2 seconds. Since the data basis available to the identification algorithm would otherwise grow constantly during the simulation and cause runtime problems, the identification data was restricted to the most recent 500 samples (representing 100 seconds). As underlying GP algorithm the SASEGASA was applied, working with a population size of 300 individuals, 5% mutation rate and a combination of Random Selection and Roulette Selection as selection operator with generational replacement. The average of squared errors was chosen as fitness function.
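The data-feeding scheme of this experiment (20 Hz recordings reduced to 5 Hz, 50 initial samples, one new sample per 0.2 seconds, at most 500 samples kept) can be sketched as follows. The class and function names are hypothetical, and plain decimation (keeping every fourth sample) is an assumption; the paper does not state how the downsampling was performed.

```python
from collections import deque


def downsample(signal, factor):
    """Keep every factor-th sample (plain decimation; an assumption)."""
    return signal[::factor]


class SlidingDataPool:
    """Holds only the most recent max_samples samples."""

    def __init__(self, initial, max_samples=500):
        self.samples = deque(initial, maxlen=max_samples)

    def add(self, sample):
        # Once the pool is full, the oldest sample is dropped automatically.
        self.samples.append(sample)


# 1,400 s at 20 Hz give 28,000 raw samples, i.e. 7,000 samples at 5 Hz.
raw = list(range(28_000))
reduced = downsample(raw, factor=4)

pool = SlidingDataPool(reduced[:50], max_samples=500)
for sample in reduced[50:]:  # one new sample per 0.2 s of simulated time
    pool.add(sample)
```

At 5 Hz, a window of 500 samples covers the most recent 100 seconds, which matches the figures given in the text.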

Figures 9 and 10 illustrate the algorithm’s behaviour and graphically show evaluations of the currently best models after some minutes (Fig. 9) and at the end of the whole simulation (Fig. 10). The model that was returned by the program in the end (after finishing the whole simulation, i.e. after approximately 23 minutes) is shown in Figure 11; it was checked and rated as a very good one by experts in the field of automotive control, namely members of the automotive group of the Institute of Design and Control of Mechatronical Systems at the University of Linz, Austria. The input channels of the model for [NOx] are the target quantity of the fuel injection pump [ME_MES16], the opacity of the engine’s emissions [OPA_OPAC] and the starting time of the fuel injection [ME_MES15] with varying coefficients and time offsets. In other words, the engine’s NOx emissions can be modelled as

[NOx] ≈ f([ME_MES16], [OPA_OPAC], [ME_MES15])    (2)

This result is in fact consistent with those retrieved during previous investigations [26].

Figure 9: Best result after three minutes (original signal and calculated signal)

Figure 10: Best result after end of the FTP cycle (original signal and calculated signal)

Figure 11: Identified model after the whole FTP cycle

In addition to the test run documented above we have tested the same data set several times applying the same algorithmic parameter settings. These test runs were all executed independently and produced different formulae modelling the engine’s NOx emissions. One of these models is graphically shown in Figure 12; [NOx] is modelled using the channels [ME_MES16], the fuel consumption [KWVAL] and the temperature of the coolant [TWA], again with varying coefficients and time offsets. I.e., the engine’s NOx emissions can also be modelled as

[NOx] ≈ f([ME_MES16], [KWVAL], [TWA])    (3)

Figure 12: Another model produced for the same data

Even though this model is not quite as good as the one we have stated previously (evaluated on the whole test data set its average squared residual is approximately 16% higher), it can still be used for fault detection because it gives a good approximation of the original target values and is consistent with the results retrieved during previous investigations. Due to the fact that its set of input signals differs from the set of inputs of the model previously mentioned, these two models can be used for data-based fault diagnosis based on analytical redundancy (cf. Section 4).
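How two models with different input sets could support fault isolation can be illustrated with a small sketch. The decision rule below is a simplified assumption made for illustration and is not taken from the paper; only the channel names and the input sets of the two models come from the text. The estimates are passed in as plain numbers because the identified formulae themselves are only shown in the figures.

```python
INPUTS_A = {"ME_MES16", "OPA_OPAC", "ME_MES15"}  # inputs of the first model
INPUTS_B = {"ME_MES16", "KWVAL", "TWA"}          # inputs of the second model


def suspect_channels(measured_nox, estimate_a, estimate_b, tolerance):
    """Return the channels that could explain the observed residuals.

    Simplified rule (assumption): a model 'raises a fault' if its estimate
    deviates from the measured NOx value by more than the tolerance.
    """
    fault_a = abs(measured_nox - estimate_a) > tolerance
    fault_b = abs(measured_nox - estimate_b) > tolerance
    if fault_a and fault_b:
        # Both models disagree with the measurement: the target sensor or a
        # shared input is the most likely cause.
        return {"NOx"} | (INPUTS_A & INPUTS_B)
    if fault_a:
        return INPUTS_A - INPUTS_B
    if fault_b:
        return INPUTS_B - INPUTS_A
    return set()
```

For instance, if only the second model deviates from the measurement, the sketch points to [KWVAL] and [TWA], the channels that the first model does not use.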

6 CONCLUSION AND OUTLOOK

On the basis of evolution-inspired heuristic optimization techniques, an enhanced on-line learning and model structure identification approach based on Genetic Programming has been presented. In addition to giving a possible application of this method within industrial fault detection systems, we have documented how it was successfully applied to a NOx identification problem producing surprisingly good results. Since the results for several problems are very good, even more challenging ones (such as the identification of soot and real-world multiclass classification problems) have to be attacked. Furthermore, the use of several populations in parallel and the combination of the respective identification statements via a voting mechanism is planned to be implemented, too.

REFERENCES

1 Langley, P. et al. (1987) Scientific Discovery: Computational Explorations of the Creative Process, The MIT Press, Cambridge, Mass.
2 Holland, J.H. (1975) Adaptation in Natural and Artificial Systems, MIT Press, Cambridge, Mass.
3 Goldberg, D.E. (1989) Genetic Algorithms in Search, Optimization and Machine Learning, Addison Wesley Longman.
4 Michalewicz, Z. (1996) Genetic Algorithms + Data Structures = Evolution Programs, 3rd edn., Springer, Berlin Heidelberg New York.
5 Affenzeller, M. (2003) New Hybrid Variants of Genetic Algorithms - Theoretical and Practical Aspects, Universitätsverlag Rudolf Trauner, Linz, Austria.
6 Langdon, W. and Poli, R. (2002) Foundations of Genetic Programming, Springer, Berlin Heidelberg New York.
7 Koza, J. (1992) Genetic Programming: On the Programming of Computers by means of Natural Selection, The MIT Press, Cambridge, Mass.
8 Winkler, S., Affenzeller, M. and Wagner, S. (2004) ‘New Methods for the Identification of Nonlinear Model Structures Based Upon Genetic Programming Techniques’, Proceedings of the 15th International Conference on Systems Science, Vol. 1, pp. 386-393, Oficyna Wydawnicza Politechniki Wroclawskiej.
9 Winkler, S. (2004) Identification of Nonlinear Model Structures by Genetic Programming, Master Thesis, Institut für Systemtheorie und Simulation, Technisch-Naturwissenschaftliche Fakultät der Johannes Kepler Universität, Linz, Austria.
10 Winkler, S., Affenzeller, M. and Wagner, S. (2005) ‘Solving Multiclass Classification Problems by Genetic Programming’, Proceedings of The 9th World Multi-Conference on Systemics, Cybernetics and Informatics.
11 Gray, G.J., Murray-Smith, D.J., Li, Y., Sharman, K.C. and Weinbrenner, T. (1998) ‘Nonlinear Model Structure Identification Using Genetic Programming’, Control Engineering Practice, Vol. 6, pp. 1341-1352.
12 Affenzeller, M. and Wagner, S. (2004) ‘SASEGASA: A New Generic Parallel Evolutionary Algorithm for Achieving Highest Quality Results’, Journal of Heuristics - Special Issue on New Advances on Parallel Meta-Heuristics for Complex Problems, Vol. 10, pp. 239-263, Kluwer Academic Publishers.
13 Affenzeller, M. and Wagner, S. (2005) ‘Offspring Selection: A New Self-Adaptive Selection Scheme for Genetic Algorithms’, Adaptive and Natural Computing Algorithms, Springer Computer Science, pp. 218-221.
14 Wagner, S. and Affenzeller, M. (2005) ‘HeuristicLab: A Generic and Extensible Optimization Environment’, Adaptive and Natural Computing Algorithms, Springer Computer Science, pp. 538-541.
15 Werner, J.C. and Fogarty, T.C. (2001) ‘Genetic Programming Applied to Gene Function Identification’, Proceedings of the Seventh ACM SIGKDD International Conference on Knowledge Discovery and Data Mining.
16 Martin, M. (2002) ‘Genetic Programming for Real World Robot Vision’, Proceedings of the 2002 IEEE/RSJ International Conference on Intelligent Robots and Systems, pp. 67-72.
17 Lazarus, C. and Huosheng, H. (2001) ‘Using Genetic Programming to Evolve Robot Behaviours’, Proceedings of the 3rd British Conference on Autonomous Mobile Robotics & Autonomous Systems.
18 Ellis, J.B. (1998) An Investigation of Predictive and Adaptive Model-Based Methods for Direct Ground-to-Space Teleoperation with Time Delay, Master Thesis, Wright State University.
19 Takaagi, M. (2000) ‘Prospects of Failure Diagnostics of Automotive Electronic Control Systems’, in: On- and Off-board Diagnostics, R. K. Jurgen (ed.), SAE Inc, Warrendale, PA, pp. 1-12.
20 Bremer, W. (2000) ‘On- and Off-board Diagnostics: The role of Legislation and Standardization’, in: On- and Off-board Diagnostics, R. K. Jurgen (ed.), SAE Inc, Warrendale, PA.
21 Schrempf, A., Del Re, L., Groissböck, W., Lughofer, E., Klement, E.P. and Frizberg, G. (2001) ‘Automatic Engine Modeling for Failure Detection’, Proceedings of the 2001 ASME International Mechanical Engineering Congress and Exposition, pp. 1-8, New York.
22 Efendic, H., Del Re, L. and Frizberg, G. (2005) ‘Iterative Multi-Step Diagnosis Process for Engine Systems’, Proceedings of the SAE World Congress 2005, paper number: 2005-01-1055.
23 Chow, E.Y. and Willsky, A.S. (1984) ‘Analytical Redundancy and the Design of Robust Failure Detection Systems’, IEEE Transactions on Automatic Control Systems, Vol. AC 29.
24 Gertler, J.J. (1988) ‘Survey of Model-Based Failure Detection and Isolation in Complex Plants’, IEEE Control Systems Magazine, 8(6), pp. 3-11.
25 Efendic, H., Schrempf, A. and Del Re, L. (2003) ‘Data Based Fault Isolation in Complex Measurement Systems Using Model on Demand’, Proceedings of the 5th IFAC Symposium on Fault Detection, Supervision and Safety of Technical Processes, Washington D.C., USA, pp. 1149-1154.
26 Del Re, L., Langthaler, P., Furtmüller, C., Winkler, S. and Affenzeller, M. (2005) ‘NOx Virtual Sensor Based on Structure Identification and Global Optimization’, Proceedings of the SAE World Congress 2005.
