International Journal of Computational Cognition (http://www.YangSky.com/yangijcc.htm)
Volume 2, Number 1, Pages 45–78, March 2004
Publisher Item Identifier S 1542-5908(03)20103-2/$20.00
Article electronically published on January 29, 2003 at http://www.YangSky.com/ijcc21.htm. Please cite this paper as: ⟨Patricia Melin and Oscar Castillo, “Soft Computing for Intelligent Control of Nonlinear Dynamical Systems (Invited Paper)”, International Journal of Computational Cognition (http://www.YangSky.com/yangijcc.htm), Volume 2, Number 1, Pages 45–78, March 2004⟩.
SOFT COMPUTING FOR INTELLIGENT CONTROL OF NONLINEAR DYNAMICAL SYSTEMS (INVITED PAPER)
PATRICIA MELIN AND OSCAR CASTILLO
Abstract. We describe in this paper the application of soft computing techniques to controlling non-linear dynamical systems in real-world problems. Soft computing consists of fuzzy logic, neural networks, evolutionary computation, and chaos theory. Controlling real-world non-linear dynamical systems may require the use of several soft computing techniques to achieve the desired performance in practice. For this reason, several hybrid intelligent architectures have been developed. The basic idea of these hybrid architectures is to combine the advantages of each of the techniques involved in the intelligent system. Also, non-linear dynamical systems are difficult to control due to the unstable and even chaotic behaviors that may occur in these systems. The described applications include robotics, aircraft systems, biochemical reactors, and manufacturing of batteries. Copyright ©2003 Yang’s Scientific Research Institute, LLC. All rights reserved.
Received by the editors January 15, 2003 / final version received January 27, 2003.
Key words and phrases. Soft computing, fuzzy logic, neural networks, genetic algorithms.
We would like to thank the research grant committee of CONACYT-Mexico, for the financial support given to this research project, under grant 33780-A, and also COSNET for the research grants 743.99-P, 414.01-P and 487.02-P. We would also like to thank the Department of Computer Science of Tijuana Institute of Technology for the time and resources given to this project.

1. Introduction

We describe in this paper the application of soft computing techniques and fractal theory to the control of non-linear dynamical systems [8]. Soft computing consists of fuzzy logic, neural networks, evolutionary computation, and chaos theory [23]. Each of these techniques has been applied successfully to real-world problems. However, there are applications in which one of these techniques is not sufficient to achieve the level of accuracy and efficiency needed in practice. For this reason, it is necessary to combine several of these techniques to take advantage of the power that each technique offers. We describe several hybrid architectures that combine different soft computing techniques. We also describe the development of hybrid intelligent systems combining several of these techniques to achieve better performance in controlling real dynamical systems. We illustrate these ideas with applications to robotic systems, aircraft systems, biochemical reactors, and manufacturing systems. Each of these problems has its own characteristics, but all of them share non-linear dynamic behavior. For this reason, the use of soft computing techniques is justified. In all of these applications, the results of using soft computing techniques have been better than with traditional techniques.

2. Neural Network Models

A neural network model takes an input vector X and produces an output vector Y. The relationship between X and Y is determined by the network architecture [23]. There are many forms of network architecture (inspired by the neural architecture of the brain). The neural network generally consists of at least three layers: one input layer, one output layer, and one or more hidden layers. Figure 1 illustrates a neural network with p neurons in the input layer, one hidden layer with q neurons, and one output layer with one neuron.
Figure 1. Single hidden layer feedforward neural network.

In the neural network we will be using, the input layer has p+1 processing elements, i.e., one for each predictor variable plus a processing element for the bias. The bias element always has an input of one, $X_{p+1} = 1$. Each processing element in the input layer sends signals $X_i$ ($i = 1, \ldots, p+1$) to
each of the q processing elements in the hidden layer. The q processing elements in the hidden layer (indexed by $j = 1, \ldots, q$) produce an “activation” $a_j = F\left(\sum_i w_{ij} X_i\right)$, where $w_{ij}$ are the weights associated with the connections between the p+1 processing elements of the input layer and the jth processing element of the hidden layer. Once again, processing element q+1 of the hidden layer is a bias element and always has an activation of one, i.e., $a_{q+1} = 1$. Assuming that the processing element in the output layer is linear, the network model will be

$$Y_t = \sum_{j=1}^{p+1} \pi_j x_{jt} + \sum_{j=1}^{q+1} \theta_j F\left( \sum_{i=1}^{p+1} w_{ij} x_{it} \right). \tag{1}$$
Here $\pi_j$ are the weights for the connections between the input layer and the output layer, and $\theta_j$ are the weights for the connections between the hidden layer and the output layer. The main requirement to be satisfied by the activation function $F(\cdot)$ is that it be nonlinear and differentiable. Typical functions used are the sigmoid, hyperbolic tangent, and sine functions, i.e.,

$$F(x) = \frac{1}{1 + e^{-x}}, \quad \text{or} \quad F(x) = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}}, \quad \text{or} \quad F(x) = \sin(x). \tag{2}$$

The weights in the neural network can be adjusted to minimize some criterion such as the sum of squared error (SSE) function:

$$E_1 = \frac{1}{2} \sum_{l=1}^{n} (d_l - y_l)^2, \tag{3}$$

where $d_l$ is the desired output and $y_l$ is the network output for the lth training pattern.
Thus, the weights in the neural network are similar to the regression coefficients in a linear regression model. In fact, if the hidden layer is eliminated, (1) reduces to the well-known linear regression function. It has been shown [13, 24] that, given sufficiently many hidden units, (1) is capable of approximating any measurable function to any accuracy. In fact F (·) can be an arbitrary sigmoid function without any loss of flexibility. The most popular algorithm for training feedforward neural networks is the backpropagation algorithm. As the name suggests, the error computed from the output layer is backpropagated through the network, and the weights are modified according to their contribution to the error function. Essentially, backpropagation performs a local gradient search, and hence its implementation does not guarantee reaching a global minimum. A number of heuristics are available to partly address this problem, some of which are presented below. Instead of distinguishing between the weights of the different layers as in Eq. (1), we refer to them generically as wij in the following.
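As a minimal sketch of Eqs. (1) and (3), the following Python fragment evaluates the network output for one input vector and the SSE for a set of targets. The function names `forward` and `sse`, the argument layout, and the choice of the hyperbolic tangent for F are assumptions made for illustration only. Following the text, the hidden bias element is given a fixed activation of one rather than being passed through F.

```python
import math

def forward(x, w, theta, pi):
    """Evaluate the single-hidden-layer model of Eq. (1) for one input vector.

    x     : p predictor values (the bias input X_{p+1} = 1 is appended here)
    w     : q rows, each with the p+1 input-to-hidden weights of one hidden unit
    theta : q+1 hidden-to-output weights (last entry multiplies the hidden bias)
    pi    : p+1 direct input-to-output weights
    """
    xb = list(x) + [1.0]                        # append the bias input
    # hidden activations a_j = F(sum_i w_ij x_i), here with F = tanh (assumed)
    a = [math.tanh(sum(wi * xi for wi, xi in zip(row, xb))) for row in w]
    a.append(1.0)                               # hidden bias element a_{q+1} = 1
    linear_part = sum(p_j * x_j for p_j, x_j in zip(pi, xb))
    hidden_part = sum(t_j * a_j for t_j, a_j in zip(theta, a))
    return linear_part + hidden_part

def sse(targets, outputs):
    """Sum of squared errors E_1 of Eq. (3)."""
    return 0.5 * sum((d - y) ** 2 for d, y in zip(targets, outputs))

# Hypothetical weights: with the hidden units switched off, only the linear
# part and the constant 0.5 from the hidden bias remain: 1*1 + 2*1 + 0.5.
y_lin = forward([1.0, 1.0],
                [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]],
                [0.0, 0.0, 0.5],
                [1.0, 2.0, 0.0])
```

With the hidden units contributing nothing, the output depends linearly on the inputs, which is the linear regression special case mentioned above.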
After some mathematical simplification, the weight change equation suggested by backpropagation can be expressed as follows:

$$\Delta w_{ij} = -\eta \frac{\partial E_1}{\partial w_{ij}} + \theta \Delta w_{ij}. \tag{4}$$
Here, $\eta$ is the learning coefficient and $\theta$ is the momentum term; the $\Delta w_{ij}$ on the right-hand side of (4) is the weight change from the previous update. One heuristic that is used to prevent the neural network from getting stuck at a local minimum is the random presentation of the training data. Another heuristic that can speed up convergence is the cumulative update of weights, i.e., weights are not updated after the presentation of each input-output pair, but are accumulated until a certain number of presentations are made, this number being referred to as an “epoch”. In the absence of the second term in (4), setting a low learning coefficient results in slow learning, whereas a high learning coefficient can produce divergent behavior. The second term in (4) reinforces general trends, whereas oscillatory behavior is cancelled out, thus allowing a low learning coefficient but faster learning. Last, it is suggested that starting the training with a large learning coefficient and letting its value decay as training progresses speeds up convergence.

2.1. Levenberg-Marquardt Modifications for Neural Networks. The method of steepest descent, also known as the gradient method, is one of the oldest techniques for minimizing a given function defined on a multidimensional space. This method forms the basis for many optimization techniques. More generally, the descent direction can be determined using the second derivatives of the objective function E. The matrix of second derivatives is known as the Hessian matrix H. In the classical Newton’s method this matrix is used to define an adaptation rule for a parameter vector $\theta$ as follows:

$$\theta_{\text{next}} = \theta_{\text{now}} - H^{-1} g, \tag{5}$$
where g is the gradient vector consisting of all the first-order derivatives of the function E. In Newton’s method H needs to be positive definite to have convergence. Furthermore, if the Hessian matrix is not positive definite, the Newton direction may point toward a local maximum or a saddle point. The Hessian can be altered by adding a positive definite matrix P to H so that the sum is positive definite. Levenberg and Marquardt [15] introduced this notion in least-squares problems. Later, Goldfeld et al. [11] first applied this concept to Newton’s method. When $P = \lambda I$, Equation (5) becomes

$$\theta_{\text{next}} = \theta_{\text{now}} - (H + \lambda I)^{-1} g, \tag{6}$$
where I is the identity matrix and $\lambda$ is some nonnegative value. Depending on the magnitude of $\lambda$, the method transits smoothly between two extremes: Newton’s method ($\lambda \to 0$) and the well-known steepest descent method ($\lambda \to \infty$). The various Levenberg-Marquardt algorithms differ in the selection of $\lambda$. Goldfeld et al. computed the eigenvalues of H and set $\lambda$ to a value a little larger than the magnitude of the most negative eigenvalue. Moreover, when $\lambda$ increases, $\|\theta_{\text{next}} - \theta_{\text{now}}\|$ decreases. In other words, $\lambda$ plays the same role as an adjustable step length. That is, with some appropriately large $\lambda$, the step length will be the right one. Of course, the step size $\eta$ can be further introduced and can be determined in conjunction with line search methods:

$$\theta_{\text{next}} = \theta_{\text{now}} - \eta (H + \lambda I)^{-1} g. \tag{7}$$
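The effect of $\lambda$ in Eq. (7) can be seen on a small example. The sketch below is an illustration under stated assumptions, not the training code used for the networks in this paper: it handles only a two-parameter problem (so the inverse can be written in closed form), and the quadratic test function $E = \frac{1}{2}(\theta_1^2 + 10\,\theta_2^2)$, the starting point and the name `lm_step` are all hypothetical.

```python
def lm_step(theta, grad, hess, lam, eta=1.0):
    """One update theta_next = theta_now - eta * (H + lam*I)^{-1} g (Eq. 7), 2-D case."""
    (h11, h12), (h21, h22) = hess
    a11, a22 = h11 + lam, h22 + lam             # diagonal of H + lambda*I
    det = a11 * a22 - h12 * h21
    if det == 0.0:
        raise ValueError("H + lambda*I is singular; increase lambda")
    g1, g2 = grad
    # solve (H + lambda*I) d = g with the closed-form 2x2 inverse
    d1 = (a22 * g1 - h12 * g2) / det
    d2 = (-h21 * g1 + a11 * g2) / det
    return [theta[0] - eta * d1, theta[1] - eta * d2]

def step_norm(lam):
    """Length of the step taken from theta = (1, 1) on the assumed test function."""
    theta = [1.0, 1.0]
    grad = [1.0, 10.0]                          # gradient of E at (1, 1)
    hess = [[1.0, 0.0], [0.0, 10.0]]            # Hessian of E
    new = lm_step(theta, grad, hess, lam)
    return ((new[0] - theta[0]) ** 2 + (new[1] - theta[1]) ** 2) ** 0.5

# lambda = 0 gives the full Newton step; larger lambda gives shorter steps
norms = [step_norm(lam) for lam in (0.0, 1.0, 100.0)]
```

For this quadratic the Newton step ($\lambda = 0$) reaches the minimum in one update, while the step length shrinks as $\lambda$ grows, in line with the remark that $\lambda$ acts as an adjustable step length.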
For the case of neural networks these ideas are used to update (or learn) the weights of the network [8].

3. Fractal Dimension of a Geometrical Object

Recently, considerable progress has been made in understanding the complexity of an object through the application of fractal concepts [14] and dynamic scaling theory [3]. For example, financial time series show scaled properties suggesting a fractal structure [8]. The fractal dimension of a geometrical object can be defined as follows:

$$d = \lim_{r \to 0} \frac{\ln N(r)}{\ln(1/r)}, \tag{8}$$

where $N(r)$ is the number of boxes covering the object and r is the size of the box. An approximation to the fractal dimension can be obtained by counting the number of boxes covering the boundary of the object for different r sizes and then performing a logarithmic regression to obtain d (box counting algorithm). In Figure 2, we illustrate the box counting algorithm for a hypothetical curve C. Counting the number of boxes for different sizes of r and performing a logarithmic linear regression, we can estimate the box dimension of a geometrical object with the following equation:

$$\ln N(r) = \ln \beta - d \ln r, \tag{9}$$
this algorithm is illustrated in Figure 3. The fractal dimension can be used to characterize an arbitrary object. The reason for this is that the fractal dimension measures the geometrical complexity of objects. In this case, a time series can be classified by using the numeric value of the fractal dimension (d is between 1 and 2 because we are on the plane x-y). The reasoning behind this classification scheme is that when the boundary is smooth the fractal dimension of the object
will be close to one. On the other hand, when the boundary is rougher the fractal dimension will be close to a value of two.
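The box counting procedure and the regression of Eq. (9) can be sketched in a few lines of Python. This is an illustrative sketch, not the authors' program: it works on a list of sampled points rather than on an image, and the function names, the box sizes and the two test objects (a straight segment and a filled square) are assumptions chosen so that the expected dimensions, one and two, are known in advance.

```python
import math

def box_count(points, r):
    """Number of grid boxes of side r that contain at least one point, N(r)."""
    return len({(math.floor(x / r), math.floor(y / r)) for x, y in points})

def box_dimension(points, sizes):
    """Estimate d from the least-squares fit ln N(r) = ln(beta) - d ln(r), Eq. (9)."""
    xs = [math.log(r) for r in sizes]
    ys = [math.log(box_count(points, r)) for r in sizes]
    n = len(sizes)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return -slope                               # the slope of the fit is -d

sizes = [2.0 ** -k for k in range(1, 6)]        # r = 1/2, 1/4, ..., 1/32

# smooth object: a densely sampled straight segment in the unit square
line = [(i / 4096, i / 4096) for i in range(4096)]
# plane-filling object: a dense grid of points covering the unit square
square = [(i / 64, j / 64) for i in range(64) for j in range(64)]

d_line = box_dimension(line, sizes)
d_square = box_dimension(square, sizes)
```

The two test objects mark the ends of the range discussed above: the smooth segment yields a dimension of one and the plane-filling set yields two, so a rough curve in the plane is expected to fall in between.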
Figure 2. Box counting algorithm for a curve C.
Figure 3. Logarithmic regression to find dimension.

We developed a computer program in MATLAB for calculating the fractal dimension of a sound signal. The computer program uses as input the figure of the signal and counts the number of boxes covering the object for different grid sizes.

4. Intelligent Control Using Soft Computing

First, we describe a new method for adaptive model-based control of robotic dynamic systems using a neuro-fuzzy-fractal approach. Intelligent control of robotic dynamic systems is a difficult problem because the dynamics of these systems is highly non-linear [5]. We describe an intelligent system for controlling robot manipulators to illustrate our neuro-fuzzy-fractal approach for adaptive control. We use a new fuzzy inference system for reasoning with multiple differential equations for modelling based on the relevant parameters for the problem [6]. In this case, the fractal dimension [14] of a time series of measured values of the variables is used as a parameter for the fuzzy system. We use neural networks for identification
and control of robotic dynamic systems [4, 21]. The neural networks are trained with the Levenberg-Marquardt learning algorithm with real data to achieve the desired level of performance. Combining a fuzzy rule base [32] for modelling with the neural networks for identification and control, an intelligent system for adaptive model-based control of robotic dynamic systems was developed. We have very good simulation results for several types of robotic systems under different conditions. The new method for control combines the advantages of fuzzy logic (use of expert knowledge) with the advantages of neural networks (learning and adaptability), and the advantages of the fractal dimension (pattern classification) to achieve the goal of robust adaptive control of robotic dynamic systems.

The neuro-fuzzy-fractal approach described above can also be applied to the control of biochemical reactors [21]. In this case, we use mathematical models of the reactors to achieve adaptive model-based control. We also use a fuzzy inference system for differential equations to take into consideration several models of the biochemical reactor. The neural networks are used for identification and control. The fractal dimension of the bacteria used in the reactor is also an important parameter in the fuzzy rules, because it accounts for the complexity of the biochemical process. We have very good results for several food production processes in which the biochemical reactor is controlled to optimize the production.

We have also used our hybrid approach to control chaotic and unstable behavior in aircraft dynamic systems [22]. For this case, we use mathematical models for the simulation of aircraft dynamics during flight. The goal of constructing these models is to capture the dynamics of the aircraft, so as to have a way of controlling these dynamics to avoid dangerous behavior of the system. Chaotic behavior has been related to the flutter effect that occurs in real airplanes, and for this reason has to be avoided during flight. The prediction of chaotic behavior can be done using the mathematical models of the dynamical system. We use a fuzzy inference system combining multiple differential equations for modelling complex aircraft dynamic systems. In addition, we use neural networks trained with the Levenberg-Marquardt algorithm for control and identification of the dynamic systems. The proposed adaptive controller performs rather well considering the complexity of the domain.

We also describe in this paper several hybrid approaches for controlling electrochemical processes in manufacturing applications. The hybrid approaches combine soft computing techniques to achieve the goal of controlling the manufacturing process to follow a desired production plan. Electrochemical processes, like the ones used in battery formation, are very complex and for this reason very difficult to control. Also, mathematical models of
electrochemical processes are difficult to derive and they are not very accurate. We need adaptive control of the electrochemical process to achieve on-line control of the production line. Of course, adaptive control is easier to achieve if one uses a reference model of the process [21, 22]. In this case, we use a neural network to model the electrochemical process due to the difficulty in obtaining a good mathematical model for the problem. The other part of the problem is how to control the non-linear electrochemical process in the desired way to achieve production with the required quality. We developed a set of fuzzy rules using expert knowledge for controlling the manufacturing process. The membership functions for the linguistic variables in the rules were tuned using a specific genetic algorithm. The genetic algorithm was used for searching the parameter space of the membership functions using real data from production lines. Our particular neuro-fuzzy-genetic approach has been implemented as an intelligent system to control the formation of batteries in a real plant with very good results.
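To make the idea of tuning membership functions by genetic search concrete, the following sketch fits the three parameters of a single triangular membership function to sample data. It is a generic, simplified genetic algorithm written for illustration and should not be read as the specific algorithm used in the plant: the triangular shape, the synthetic data (standing in for real production data), the operators (tournament selection, blend crossover, Gaussian mutation, elitism) and all names and settings are assumptions.

```python
import random

def tri(x, a, b, c):
    """Triangular membership function with feet a, c and peak b (a <= b <= c)."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)

def error(params, data):
    """Squared error between the membership function and the sample data."""
    a, b, c = sorted(params)
    return sum((tri(x, a, b, c) - mu) ** 2 for x, mu in data)

def tune(data, lo, hi, pop_size=20, generations=40, seed=0):
    """Search the (a, b, c) parameter space; returns best parameters and history."""
    rng = random.Random(seed)
    pop = [sorted(rng.uniform(lo, hi) for _ in range(3)) for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        pop.sort(key=lambda p: error(p, data))
        history.append(error(pop[0], data))
        nxt = [pop[0][:]]                       # elitism: keep the best unchanged
        while len(nxt) < pop_size:
            # tournament selection of two parents
            p1 = min(rng.sample(pop, 3), key=lambda p: error(p, data))
            p2 = min(rng.sample(pop, 3), key=lambda p: error(p, data))
            w = rng.random()                    # blend crossover weight
            child = [w * u + (1.0 - w) * v for u, v in zip(p1, p2)]
            child = sorted(g + rng.gauss(0.0, 0.1) for g in child)  # mutation
            nxt.append(child)
        pop = nxt
    best = min(pop, key=lambda p: error(p, data))
    history.append(error(best, data))
    return sorted(best), history

# synthetic samples from a hypothetical target membership function tri(2, 5, 8)
data = [(x / 2.0, tri(x / 2.0, 2.0, 5.0, 8.0)) for x in range(21)]
best, history = tune(data, 0.0, 10.0)
```

Because the best individual is always carried over, the recorded error never increases from one generation to the next; how close the result comes to the target depends on the settings chosen.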
5. Intelligent Control of Robotic Systems

Given the dynamic equations of motion of a robot manipulator, the purpose of robot arm control is to maintain the dynamic response of the manipulator in accordance with some pre-specified performance criterion [7]. Although the control problem can be stated in such a simple manner, its solution is complicated by inertial forces, coupling reaction forces, and gravity loading on the links. In general, the control problem consists of (1) obtaining dynamic models of the robotic system, and (2) using these models to determine control laws or strategies to achieve the desired system response and performance [10]. Among various adaptive control methods, model-based adaptive control is the most widely used and it is also relatively easy to implement. The concept of model-based adaptive control is based on selecting an appropriate reference model and an adaptation algorithm, which modifies the feedback gains to the actuators of the actual system. Many authors have proposed linear mathematical models to be used as reference models in the general scheme described before. For example, a linear second-order time-invariant differential equation can be used as the reference model for each degree of freedom of the robot arm. Defining the vector $y(t)$ to represent the reference model response and the vector $x(t)$ to represent the manipulator response, joint i of the reference model can be described by

$$a_i \ddot{y}_i(t) + b_i \dot{y}_i(t) + y_i(t) = r_i(t). \tag{10}$$
If we assume that the manipulator is controlled by position and velocity feedback gains and the coupling terms are negligible, then the manipulator equation for joint i can be written as

$$\alpha_i(t) \ddot{x}_i(t) + \beta_i(t) \dot{x}_i(t) + x_i(t) = r_i(t), \tag{11}$$
where the system parameters $\alpha_i(t)$ and $\beta_i(t)$ are assumed to vary slowly with time. The fact that this control approach is not dependent on a complex mathematical model is one of its major advantages, but stability considerations of the closed-loop adaptive system are critical. A stability analysis is difficult and has only been carried out using linearized models. However, the adaptability of the controller can become questionable if the interaction forces among the various joints are severe (non-linear). This is the main reason why soft computing techniques [7] have been proposed to control this type of dynamic system.

Adaptive fuzzy control extends fuzzy control theory so that the fuzzy controller can either be applied to a wider class of uncertain systems or fine-tune the parameters of a system more accurately [9]. In this scheme, a fuzzy controller is designed based on knowledge of a dynamic system. This fuzzy controller is characterized by a set of parameters. These parameters are either the controller constants or functions of a model’s constants. A controller is designed based on an assumed mathematical model representing a real system. It must be understood that the mathematical model does not completely match the real system to be controlled. Rather, the mathematical model is seen as an approximation of the real system. A controller designed based on this model is assumed to work effectively with the real system if the error between the actual system and its mathematical representation is relatively insignificant. However, there exists a threshold constant that sets a boundary for the effectiveness of a controller. An error above this threshold will render the controller ineffective toward the real system. An adaptive controller is set up to take advantage of additional data collected at run time for better effectiveness.

At run time, data are collected periodically at the beginning of each constant time interval, $t_n = t_{n-1} + \Delta t$, where $\Delta t$ is a constant time increment and $[t_{n-1}, t_n)$ is the interval between two data collections. Let $D_n$ be the set of data collected at time $t = t_n$. It is assumed that at any particular time, $t = t_n$, a history of data $\{D_0, D_1, \ldots, D_n\}$ is always available. The more data are available, the more accurate the approximation of the system will become.
At run time, the control input is fed into both the real system and the mathematical model representing the system. The output of the real system and the output of the mathematical model are collected, and an error representing the difference between these two outputs is calculated. Let $x(t)$ be the output of the real system, and $y(t)$ the output of the mathematical model. The error $\varepsilon(t)$ is defined as:

$$\varepsilon(t) = x(t) - y(t). \tag{12}$$
Figure 4 depicts this tracking of the difference between the mathematical model and the real dynamic system it represents.
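The tracking error of Eq. (12) can be illustrated with the second-order forms of Eqs. (10) and (11) for a single joint. The sketch below is only an illustration under stated assumptions: the coefficients are hypothetical, $\alpha$ and $\beta$ are held constant instead of varying slowly, the input is a unit step, and a standard fourth-order Runge-Kutta scheme (the name `simulate` is arbitrary) stands in for whatever integrator is used in practice.

```python
def simulate(a, b, r, t_end=40.0, dt=0.01):
    """Integrate a*z'' + b*z' + z = r from rest with RK4; returns the samples of z."""
    def f(z, v):
        # first-order form: z' = v, v' = (r - b*v - z) / a
        return v, (r - b * v - z) / a

    z, v, out = 0.0, 0.0, [0.0]
    for _ in range(int(round(t_end / dt))):
        k1z, k1v = f(z, v)
        k2z, k2v = f(z + 0.5 * dt * k1z, v + 0.5 * dt * k1v)
        k3z, k3v = f(z + 0.5 * dt * k2z, v + 0.5 * dt * k2v)
        k4z, k4v = f(z + dt * k3z, v + dt * k3v)
        z += dt * (k1z + 2.0 * k2z + 2.0 * k3z + k4z) / 6.0
        v += dt * (k1v + 2.0 * k2v + 2.0 * k3v + k4v) / 6.0
        out.append(z)
    return out

y = simulate(a=1.0, b=2.0, r=1.0)   # model response, form of Eq. (10)
x = simulate(a=2.0, b=2.0, r=1.0)   # "real" joint, form of Eq. (11), assumed constants
eps = [xi - yi for xi, yi in zip(x, y)]   # error series of Eq. (12)
```

Both responses settle at the commanded value, so the error vanishes in steady state, but the mismatch in the coefficients produces a clear transient error, which is the signal an adaptive mechanism would act on.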
Figure 4. Tracking the error function between outputs of a real system and mathematical model.

An adaptive controller will be adjusted based on the error function $\varepsilon(t)$. This calculated data will be fed into either the mathematical model or the controller for adjustment. Since the error function $\varepsilon(t)$ is available only at run time, an adjusting mechanism must be designed to accept this error as it becomes available, i.e., it must evolve with the accumulation of data in time. At any time, $t = t_n$, the set of calculated data in the form of a time series $\{\varepsilon(t_0), \varepsilon(t_1), \ldots, \varepsilon(t_n)\}$ is available and must be used by the adjusting mechanism to update appropriate parameters. In normal practice, instead of doing a re-calculation based on a lengthy set of data, the adjusting algorithm is reformulated to be based on two entities: (i) sufficient information, and (ii) newly collected data. The sufficient information is a numerical variable representing the set of data $\{\varepsilon(t_0), \varepsilon(t_1), \ldots, \varepsilon(t_{n-1})\}$ collected from the initial time $t_0$ to the previous collecting cycle starting at time $t = t_{n-1}$. The new datum $\varepsilon(t_n)$ is collected in the current cycle starting at time $t = t_n$.
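One simple way to realize this reformulation is a recursively updated statistic: the stored value summarizes all past errors, and each new datum updates it without revisiting the history. In the sketch below the mean squared error is chosen purely as an example of such a numerical variable; the paper does not state which quantity is used, and the class name and the sample error values are hypothetical.

```python
class ErrorStatistic:
    """Running summary of the error series eps(t_0), ..., eps(t_n)."""

    def __init__(self):
        self.n = 0            # number of errors seen so far
        self.mean_sq = 0.0    # "sufficient information": mean squared error

    def update(self, eps_n):
        """Fold the newly collected datum eps(t_n) into the stored summary."""
        self.n += 1
        self.mean_sq += (eps_n ** 2 - self.mean_sq) / self.n
        return self.mean_sq

errors = [0.5, -0.2, 0.1, 0.05]       # hypothetical error samples
stat = ErrorStatistic()
for e in errors:
    stat.update(e)

# the recursive value agrees with a recalculation over the whole history
batch = sum(e ** 2 for e in errors) / len(errors)
```

The memory and work per control cycle stay constant no matter how long the controller has been running, which is the practical reason for the reformulation.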
An adaptive controller will operate as follows. The controller is initially designed as a function of a parameter set and state variables of a mathematical model. The parameters can be updated any time during operation and the controller will adjust itself to the newly updated parameters. The time frame is usually divided into a series of equally spaced intervals $\{[t_n, t_{n+1}) \mid n = 0, 1, 2, \ldots;\ t_{n+1} = t_n + \Delta t\}$. At the beginning of each time interval $[t_n, t_{n+1})$ observable data are collected and the error function $\varepsilon(t_n)$ is calculated. This error is used to calculate the adjustment in the parameters of the controller. The new control input $u(t_n)$ for the time interval $[t_n, t_{n+1})$ is then calculated based on the newly calculated parameters and fed into both the real dynamic system under control and the mathematical model upon which the controller is designed. This completes one control cycle. The next control cycle will consist of the same steps repeated for the next time interval $[t_{n+1}, t_{n+2})$, and so on.

5.1. Mathematical Modelling of Robotic Dynamic Systems. We will consider, in this section, the case of modelling robotic manipulators [5]. The general model for this kind of robotic system is the following:

$$M(q)\ddot{q} + V(q, \dot{q})\dot{q} + G(q) + F_d \dot{q} = \tau, \tag{13}$$

where $q \in R^n$ denotes the link position, $M(q) \in R^{n \times n}$ is the inertia matrix, $V(q, \dot{q}) \in R^{n \times n}$ is the centripetal-Coriolis matrix, $G(q) \in R^n$ represents the gravity vector, $F_d \in R^{n \times n}$ is a diagonal matrix representing the friction term, and $\tau$ is the input torque applied to the links. We show in Figure 5 the case of the two-link robot arm, together with the variables involved. For the simplest case of a one-link robot arm, we have the scalar equation:

$$M_q \ddot{q} + F_d \dot{q} + G(q) = \tau. \tag{14}$$

If $G(q)$ is a linear function ($G = Nq$), then we have the “linear oscillator” model:

$$\ddot{q} + a\dot{q} + bq = c,$$

where $a = F_d / M_q$, $b = N / M_q$ and $c = \tau / M_q$. This is the simplest mathematical model for a one-link robot arm. More realistic models can be obtained for more complicated functions $G(q)$. For example, if $G(q) = Nq^2$, then we obtain the “quadratic oscillator” model:

$$\ddot{q} + a\dot{q} + bq^2 = c, \tag{15}$$

where a, b and c are defined as above. A more interesting model is obtained if we define $G(q) = N \sin(q)$. In this case, the mathematical model is

$$\ddot{q} + a\dot{q} + b\sin(q) = c, \tag{16}$$
where a, b and c are the same as above. This is the so-called “sinusoidally forced oscillator”. More complicated models for a one-link robot arm can be defined similarly.

Figure 5. Two-link robot arm indicating the variables involved.

For the case of a two-link robot arm, we can have two simultaneous differential equations as follows:

$$\begin{aligned} \ddot{q}_1 + a_1 \dot{q}_1 + b_1 q_2^2 &= c_1, \\ \ddot{q}_2 + a_2 \dot{q}_2 + b_2 q_1^2 &= c_2, \end{aligned} \tag{17}$$

which is called the “coupled quadratic oscillators” model. In Equation (17), $a_1$, $b_1$, $a_2$, $b_2$, $c_1$ and $c_2$ are defined similarly as in the previous models. We can also have the “coupled cubic oscillators” model:

$$\begin{aligned} \ddot{q}_1 + a_1 \dot{q}_1 + b_1 q_2^3 &= c_1, \\ \ddot{q}_2 + a_2 \dot{q}_2 + b_2 q_1^3 &= c_2. \end{aligned} \tag{18}$$
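Models of this kind are straightforward to explore numerically. As a minimal sketch, the fragment below integrates the one-link model of Eq. (16) with a fourth-order Runge-Kutta scheme. The parameter values a = 1, b = 1, c = 0.5, the initial state and the step size are hypothetical and chosen only so that the outcome is easy to check: with a constant right-hand side and positive damping, the arm should settle where $b\sin(q) = c$.

```python
import math

def simulate_arm(a, b, c, q0, dq0, t_end, dt=0.01):
    """RK4 integration of q'' + a q' + b sin(q) = c (Eq. 16); returns samples of q."""
    def f(q, dq):
        # first-order form: q' = dq, dq' = c - a*dq - b*sin(q)
        return dq, c - a * dq - b * math.sin(q)

    q, dq, qs = q0, dq0, [q0]
    for _ in range(int(round(t_end / dt))):
        k1q, k1d = f(q, dq)
        k2q, k2d = f(q + 0.5 * dt * k1q, dq + 0.5 * dt * k1d)
        k3q, k3d = f(q + 0.5 * dt * k2q, dq + 0.5 * dt * k2d)
        k4q, k4d = f(q + dt * k3q, dq + dt * k3d)
        q += dt * (k1q + 2.0 * k2q + 2.0 * k3q + k4q) / 6.0
        dq += dt * (k1d + 2.0 * k2d + 2.0 * k3d + k4d) / 6.0
        qs.append(q)
    return qs

# hypothetical parameters; the expected rest position is asin(c / b) = pi / 6
qs = simulate_arm(a=1.0, b=1.0, c=0.5, q0=0.1, dq0=0.0, t_end=40.0)
q_rest = math.asin(0.5)
```

The coupled models (17) and (18) can be treated in the same way by extending the state to $(q_1, \dot{q}_1, q_2, \dot{q}_2)$ and replacing the right-hand side function.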
5.2. Simulation Results. To give an idea of the performance of our neuro-fuzzy approach for adaptive model-based control of robotic systems, we show below simulation results obtained for a single-link robot arm. The desired trajectory for the link was selected to be

$$q_d = t \sin(2.0t), \tag{19}$$

and the simulation was carried out with the initial values $q(0) = 0.1$, $\dot{q}_1(0) = 0$. We used three-layer neural networks (with 15 hidden neurons) with the Levenberg-Marquardt algorithm and hyperbolic tangent sigmoidal functions as the activation functions for the neurons. We show in Figure 6(a) the function approximation achieved with the neural network for control after 9
epochs of training with a variable learning rate. The identification achieved by the neural network can be considered very good because the error has been decreased to the order of $10^{-4}$. We show in Figure 6(b) the curve relating the sum of squared errors (SSE) to the number of epochs of neural network training. We can see in this figure how the SSE diminishes rapidly from the order of $10^{2}$ to a smaller value of the order of $10^{-4}$. Still, we can obtain a better approximation by using more hidden neurons or more layers. In any case, we can see clearly how the neural network learns to control the robotic system, because it is able to follow the arbitrary desired trajectory. We show in Figure 7(a) the non-linear surface for the fuzzy rule base for modelling. The fuzzy system was implemented in the fuzzy logic toolbox of MATLAB [25]. We show in Figure 7(b) the reasoning procedure for specific values of the fractal dimension and number of links of the robotic system. In Figure 8 we show simulation results for a two-link robot arm with a model given by two coupled second-order differential equations. Figure 8(a) shows the behavior of position $q_1$ and Figure 8(b) shows it for position $q_2$ of the robot arm. We can see from these figures the complex dynamic behavior of this robotic system [7]. Of course, the complexity is even greater for higher dimensional robotic systems. We have very good simulation results for several types of robotic manipulators under different conditions. The new method for control combines the advantages of neural networks (learning and adaptability) with the advantages of fuzzy logic (use of expert knowledge) to achieve the goal of robust adaptive control of robotic dynamic systems.
We consider that our method for adaptive control can be applied to general non-linear dynamical systems [8, 27] because the hybrid approach, combining neural networks and fuzzy logic, does not depend on the particular characteristics of the robotic dynamic systems. The new method for adaptive control can also be applied to autonomous robots [8], but in this case it may be necessary to include genetic algorithms for trajectory planning.

Figure 6. (a) Function approximation after 9 epochs, (b) SSE of the neural network.

Figure 7. (a) Non-linear surface for modelling, (b) fuzzy reasoning procedure.

Figure 8. (a) Simulation of position $q_1$, (b) Simulation of position $q_2$.

6. Control of Biochemical Reactors

Process control of biochemical plants is also an attractive application because of the potential benefits both to adaptive network research and to actual biochemical process control. In spite of the extensive work on self-tuning controllers and model-reference control, there are many problems in chemical processing industries for which current techniques are inadequate. Many of the limitations of current adaptive controllers arise in trying to control poorly modeled non-linear systems [1]. For most of these processes extensive data are available from past runs, but it is difficult to formulate precise models. This is precisely where adaptive networks are expected to be useful [31].

Bioreactors are difficult to model because of the complexity of the living organisms in them, and they are also difficult to control because one often cannot measure on-line the concentration of the chemicals being metabolized or produced. Bioreactors can also have markedly different operating regimes, depending on whether the bacteria are rapidly growing or producing product. Model-based control of these reactors poses a dual problem: determining a realistic process model and determining effective control laws in the face of inaccurate process models and highly nonlinear processes [19, 20, 26].

Biochemical systems can be relatively simple in that they have few variables, but still very difficult to control due to strong nonlinearities which are difficult to model accurately. A prime example is the bioreactor. In its simplest form, a bioreactor is simply a tank containing water and cells (e.g., bacteria) which consume nutrients (“substrate”) and produce products (both desired and undesired) and more cells. Bioreactors can be quite complex: cells are self-regulatory mechanisms, and can adjust their growth rates and production of different products radically depending on temperature and concentrations of waste products [16]. Systems with heating or cooling, multiple reactors or unsteady operation greatly complicate the analysis. Mathematical models for these systems can be expressed as differential (or difference) equations [3, 17, 18].

Now we propose mathematical models that integrate our method for geometrical modelling of bacteria growth using the fractal dimension [14] with the method for modelling the dynamics of bacteria population using differential equations [27].
The resulting mathematical models describe bacteria growth in space and in time, because the use of the fractal dimension enables us to classify bacteria by the geometry of the colonies and the differential equations help us to understand the evolution in time of the bacteria population. We will consider first the case of using a single type of bacteria for food production. The mathematical model in this case can be of the following form:

$$\begin{aligned} \frac{dN}{dt} &= r\left(1 - \frac{N^{-D}}{K}\right) N^{-D} - \beta N^{-D}, \\ \frac{dP}{dt} &= \beta N^{-D}, \end{aligned} \tag{20}$$
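where D is the fractal dimension, N is the bacteria population, P is the quantity of chemical product, r is the rate of bacteria growth, K is the environment capacity, and β is a biochemical conversion factor. We consider now the case of two types of bacteria used for food production:

$$
\begin{aligned}
\frac{dN_1}{dt} &= \left[ r_1 - \left(\frac{r_1}{K_1}\right) N_1^{-D_1} - \delta_{12} \left(\frac{r_1}{K_1}\right) N_2^{-D_2} \right] N_1^{-D_1} - \beta N_1^{-D_1}, \\
\frac{dN_2}{dt} &= \left[ r_2 - \left(\frac{r_2}{K_2}\right) N_2^{-D_2} - \delta_{21} \left(\frac{r_2}{K_2}\right) N_1^{-D_1} \right] N_2^{-D_2} - \gamma N_2^{-D_2}, \\
\frac{dP}{dt} &= \beta N_1^{-D_1} + \gamma N_2^{-D_2},
\end{aligned}
\tag{21}
$$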
where D1 is the fractal dimension of bacteria 1, D2 is the fractal dimension of bacteria 2, and the rest of the variables are as described for equation (20). As equations (20) and (21) show, the idea of our modelling method is to use the fractal dimension D as a parameter in the differential equations, so that it identifies the type of bacteria to which the equation corresponds. In this way, equation (20), for example, can represent the model for food production using one type of bacteria (the one defined by the fractal dimension D).

We have implemented a model-based neural controller using the architecture of Figure 9. Two multilayer networks are used, one for the model of the plant and the second for the controller. The neural networks were implemented in the MATLAB programming language to achieve a high level of efficiency in the numerical calculations needed for these modules. The fractal module was also implemented in MATLAB for the same reason. In this way we combine the three methodologies to obtain the best of the three worlds (Neural Networks, Fuzzy Logic and Fractal Theory), using for each the appropriate implementation language.

We show in Figure 10 simulation results of the bacteria populations used for food production. This figure shows the complicated dynamics for the case of two bacteria competing in the same environment while producing the chemical product necessary for food production. We also show in Figure 11 simulation results for the case of two good bacteria used for food production and one bad bacteria that attacks the others. This figure shows how one of the good bacteria is eliminated (its population goes down to zero), which of course decreases the resulting quantity of the food product. This case has to be avoided because of the harmful effect of the bad bacteria. Intelligent control helps in avoiding these types of scenarios in food production.
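As a rough illustration of how equation (21) can be integrated numerically, the following Python sketch uses a fixed-step fourth-order Runge-Kutta scheme. All parameter values (growth rates, capacities, competition coefficients δ12 and δ21, conversion factors, fractal dimensions), the initial conditions and the step size are hypothetical placeholders chosen only to keep the run well behaved; they are not the values behind Figure 10, and the original modules were written in MATLAB rather than Python.

```python
# Hypothetical parameter values, for illustration only (not from the paper).
PARAMS = dict(r1=1.0, r2=0.8, K1=10.0, K2=8.0,
              d12=0.5, d21=0.4, beta=0.2, gamma=0.1,
              D1=1.2, D2=1.5)


def rhs(state, p=PARAMS):
    """Right-hand side of equation (21); state = (N1, N2, P)."""
    n1, n2, _ = state
    x1 = n1 ** (-p["D1"])  # N1^(-D1)
    x2 = n2 ** (-p["D2"])  # N2^(-D2)
    dn1 = (p["r1"] - p["r1"] / p["K1"] * x1
           - p["d12"] * p["r1"] / p["K1"] * x2) * x1 - p["beta"] * x1
    dn2 = (p["r2"] - p["r2"] / p["K2"] * x2
           - p["d21"] * p["r2"] / p["K2"] * x1) * x2 - p["gamma"] * x2
    dp = p["beta"] * x1 + p["gamma"] * x2
    return (dn1, dn2, dp)


def rk4_step(f, state, h):
    """One classical fourth-order Runge-Kutta step for a tuple-valued state."""
    def shifted(s, k, c):
        return tuple(si + c * ki for si, ki in zip(s, k))

    k1 = f(state)
    k2 = f(shifted(state, k1, h / 2))
    k3 = f(shifted(state, k2, h / 2))
    k4 = f(shifted(state, k3, h))
    return tuple(s + h / 6 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))


def simulate(state0=(1.0, 1.0, 0.0), h=0.01, steps=1000):
    """Return the list of (N1, N2, P) states, including the initial one."""
    trajectory = [state0]
    for _ in range(steps):
        trajectory.append(rk4_step(rhs, trajectory[-1], h))
    return trajectory
```

Calling `simulate()` returns the populations and the accumulated product at each step; the single-bacteria model of equation (20) follows from the same code by dropping the second population and the competition terms.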
Figure 9. Indirect Adaptive Neuro-Fuzzy-Fractal Control.
Figure 10. Simulation of the model for two bacteria used in food production.
Figure 11. Simulation of the model for two good bacteria and one bad one.
We have used a general method for adaptive model-based control of non-linear dynamic plants using Neural Networks, Fuzzy Logic and Fractal Theory. We illustrated our control method with the case of biochemical reactors. In this case, the models represent the process of biochemical transformation between the microbial life and its generation of the chemical product. We also described an adaptive controller based on the use of neural networks and mathematical models for the plant. The proposed adaptive controller performs rather well considering the complexity of the domain considered in this research work. We can say that combining Neural Networks, Fuzzy Logic and Fractal Theory, using the advantages that each of these methodologies has, can give good results for this kind of application. Also, we believe that our neuro-fuzzy-fractal approach is a good alternative for solving similar problems.
7. Intelligent Control of Aircraft Systems

The mathematical models of aircraft systems can be represented as coupled non-linear differential equations [22]. In this case, we can develop a fuzzy rule base for modelling that enables the use of the appropriate mathematical model according to the changing conditions of the aircraft and its environment. For example, we can use the following model of an airplane when wind velocity is relatively small:

$$
\dot{p} = I_1(-q + l), \qquad \dot{q} = I_2(p + m),
\tag{22}
$$
where I1 and I2 are the inertia moments of the airplane with respect to the x and y axes, respectively, l and m are physical constants specific to the airplane, and p, q are the positions with respect to the x and y axes, respectively. However, a more realistic model of an airplane in three-dimensional space is as follows:

$$
\dot{p} = I_1(-qr + l), \qquad \dot{q} = I_2(pr + m), \qquad \dot{r} = I_3(-pq + n),
\tag{23}
$$
where now I3 is the inertia moment of the airplane with respect to the z axis, n is a physical constant specific to the airplane, and r is the position along the z axis. Considering now wind disturbances in the model, we have the following equation:

$$
\dot{p} = I_1(-qr + l) - u_g, \qquad \dot{q} = I_2(pr + m), \qquad \dot{r} = I_3(-pq + n),
\tag{24}
$$
where ug is the wind velocity. The magnitude of the wind velocity depends on the altitude of the airplane in the following form:

$$
u_g = u_{wind510}\left(1 + \frac{\ln(r/510)}{\ln 51}\right),
$$

where uwind510 is the wind speed at 510 ft altitude (typical value = 20 ft/sec). If we use the models of Eqs. (22)-(24) for describing aircraft dynamics, we can formulate a set of rules that relate the models to the conditions of the aircraft and its environment. Let us assume that M1 is given by Eq. (22), M2 is given by Eq. (23), and M3 is given by Eq. (24). Now, using the wind velocity ug and the inertia moment I1 as parameters, we can establish the fuzzy rule base for modelling [29, 30] as in Table 1. In Table 1, we assume that the wind velocity ug can have only two possible fuzzy values (small and large). This is sufficient to decide whether we have to use the mathematical model that takes into account the effect of wind (M3), for ug large, or whether the model M2 is sufficient (for ug small). Also, the inertia moment (I1) helps in deciding between models M1 and M2 (or M3).
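The altitude dependence of the wind velocity can be evaluated directly. The sketch below is a minimal Python transcription of the formula above; the function name `wind_speed` and its argument names are our own, and the guard against non-positive altitudes is an added assumption, since the logarithm is undefined there.

```python
import math


def wind_speed(altitude_ft, u_wind510=20.0):
    """Wind speed u_g (ft/sec) at a given altitude (ft).

    u_wind510 is the wind speed at 510 ft; 20 ft/sec is the typical
    value quoted in the text.
    """
    if altitude_ft <= 0:
        raise ValueError("altitude must be positive for the logarithm")
    return u_wind510 * (1.0 + math.log(altitude_ft / 510.0) / math.log(51.0))
```

With the default reference speed, the formula returns 20 ft/sec at 510 ft, vanishes at 10 ft (since 10/510 = 1/51), and doubles to 40 ft/sec at 26,010 ft (since 26,010/510 = 51).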
Table 1. Fuzzy rule base for modelling aircraft systems.

IF                                   THEN
Wind     Inertia    Fractal Dim      Model
Small    Small      Low              M1
Small    Small      Medium           M2
Small    Large      Low              M2
Small    Large      Medium           M2
Large    Small      Medium           M3
Large    Large      Medium           M3
Large    Large      High             M3
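The seven rules of Table 1 can be written down as a simple lookup. The following Python sketch is a crisp simplification: it treats each linguistic value as an exact label, whereas a fuzzy inference system would evaluate membership degrees and combine the rules that fire. The names `RULES` and `select_model` are hypothetical, and combinations that do not appear in Table 1 return `None`.

```python
# Rule base of Table 1: (wind, inertia, fractal dimension) -> model.
RULES = {
    ("small", "small", "low"): "M1",
    ("small", "small", "medium"): "M2",
    ("small", "large", "low"): "M2",
    ("small", "large", "medium"): "M2",
    ("large", "small", "medium"): "M3",
    ("large", "large", "medium"): "M3",
    ("large", "large", "high"): "M3",
}


def select_model(wind, inertia, fractal_dim):
    """Return the model named by Table 1, or None if no rule matches."""
    key = (wind.lower(), inertia.lower(), fractal_dim.lower())
    return RULES.get(key)
```

For example, `select_model("Large", "Small", "Medium")` selects M3, the model that includes the wind disturbance.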
To give an idea of the performance of our neuro-fuzzy-fractal approach for adaptive control, we show below simulation results for aircraft dynamic systems. First, we show in Figure 12(a) the fuzzy rule base for a prototype intelligent system developed in the Fuzzy Logic Toolbox of MATLAB. We show in Figure 12(b) the non-linear surface for the problem of aircraft dynamics, using as input variables the fractal dimension and the wind velocity. We also show simulation results for an aircraft system obtained using our new method for modelling dynamical systems. In Figure 13(a) and Figure 13(b) we show results for an airplane with inertia moments I1 = 1, I2 = 0.4, I3 = 0.05 and constants l = m = n = 1. The initial conditions are p(0) = 0, q(0) = 0, r(0) = 0.

To illustrate the performance of our neuro-fuzzy approach for adaptive model-based control of aircraft dynamics, we show below (Figure 14) simulation results obtained for the case of controlling the altitude of an airplane for a flight of 6 hours. We assume that the airplane takes about one hour to reach the cruising altitude of 30,000 ft, then cruises for about three hours at this altitude (with minor fluctuations), and finally descends for about two hours to its final landing point. We consider the following desired trajectory (with t in hours and the altitude in thousands of feet):

$$
r_d =
\begin{cases}
30t + \sin 2t, & 0 \le t \le 1, \\
30 + 2\sin 10t, & 1 < t \le 4, \\
90 - 15t, & 4 < t \le 6.
\end{cases}
$$

Of course, a complete desired trajectory for the airplane would have to include the positions of the airplane in the x and y directions (variables p, q in the models). However, for illustration purposes it is sufficient here to show the control of the altitude r of the airplane.
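The desired trajectory and the open-loop model of equation (23) can both be sketched in a few lines of Python. The inertia moments, constants and initial conditions are the ones quoted in the text; the fixed-step Runge-Kutta integrator, the step size and the number of steps are assumptions made here for illustration, and the code is not the MATLAB implementation behind Figures 13 and 14.

```python
import math

I1, I2, I3 = 1.0, 0.4, 0.05  # inertia moments quoted in the text
L, M, N = 1.0, 1.0, 1.0      # constants l, m, n quoted in the text


def desired_altitude(t):
    """Desired altitude r_d(t) for 0 <= t <= 6 (piecewise, as in the text)."""
    if not 0.0 <= t <= 6.0:
        raise ValueError("t must lie in [0, 6]")
    if t <= 1.0:
        return 30.0 * t + math.sin(2.0 * t)
    if t <= 4.0:
        return 30.0 + 2.0 * math.sin(10.0 * t)
    return 90.0 - 15.0 * t


def aircraft_rhs(state):
    """Right-hand side of equation (23); state = (p, q, r)."""
    p, q, r = state
    return (I1 * (-q * r + L), I2 * (p * r + M), I3 * (-p * q + N))


def simulate_aircraft(h=0.01, steps=500, state=(0.0, 0.0, 0.0)):
    """Integrate equation (23) with fixed-step RK4 (h and steps are assumed)."""
    def shifted(s, k, c):
        return tuple(si + c * ki for si, ki in zip(s, k))

    trajectory = [state]
    for _ in range(steps):
        k1 = aircraft_rhs(state)
        k2 = aircraft_rhs(shifted(state, k1, h / 2))
        k3 = aircraft_rhs(shifted(state, k2, h / 2))
        k4 = aircraft_rhs(shifted(state, k3, h))
        state = tuple(s + h / 6 * (a + 2 * b + 2 * c + d)
                      for s, a, b, c, d in zip(state, k1, k2, k3, k4))
        trajectory.append(state)
    return trajectory
```

Note that the piecewise definition is evaluated exactly as written, so the desired trajectory has small jumps at t = 1 and t = 4 where the branches meet.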
Figure 12. (a) Fuzzy rule base. (b) Non-linear surface for aircraft dynamics.
Figure 13. (a) Simulation of position q. (b) Simulation of position p.
We used three-layer neural networks (with 10 hidden neurons) trained with the Levenberg-Marquardt algorithm, with hyperbolic tangent sigmoid functions as the activation functions for the neurons. We show in Figure 14 the function approximation achieved by the neural network for control after 800 epochs of training with a variable learning rate. The identification achieved by the neural network (after 800 epochs) can be considered very good because the error has been decreased to the order of $10^{-1}$. Still, we can obtain a better approximation by using more hidden neurons or more layers. In any case, Figure 14 shows clearly how the neural network learns to control the aircraft, because it is able to follow the arbitrary desired trajectory. We have to mention here that these simulation experiments, for the case of a specific flight of a given airplane, show very good results. We have also tried our control approach with other types of flights and airplanes, with good simulation results. Still, there is a lot of research to be done in this area because of the complex dynamics of aircraft systems.
Figure 14. Function approximation of the neural network for control of an airplane.
We have developed a general method for adaptive model-based control of non-linear dynamic systems using Neural Networks, Fuzzy Logic and Fractal Theory. We illustrated our control method with the case of aircraft dynamics. In this case, the models represent the aircraft dynamics during flight. We also described an adaptive controller based on the use of neural networks and mathematical models for the system. The proposed adaptive controller performs rather well considering the complexity of the domain considered in this research work. We have shown that our method can be used to control chaotic and unstable behavior in aircraft systems. Chaotic behavior has been associated with the “flutter” effect in real airplanes, and for this reason it is very important to avoid this kind of behavior. We can say that combining Neural Networks, Fuzzy Logic and Fractal Theory, using the advantages that each of these methodologies has, can give good results for this kind of application. Also, we believe that our neuro-fuzzy-fractal approach is a good alternative for solving similar problems.

8. Intelligent Control of the Battery Charging Process

In a battery, chemical energy is converted into electrical energy. The chemical energy contained in the electrode and electrolyte is converted into electrical power by means of electrochemical reactions. When the battery is connected to a source of direct current, a flow of electrons takes place in the external circuit, and of ions inside the battery, giving an accumulation of charge in the battery. The quantity of electric current that is required to charge the battery is determined by a law of nature postulated by Michael Faraday, known as Faraday's law [2]. Faraday found that the quantity of electricity required to perform an electrochemical change in a metal is related to the relative weight of the metal.
In the specific case of lead, this is considered to be 118 ampere-hours per pound of positive active material per cell. In practice, more energy is required to counteract the losses due to heat and to the generation of gas. We show in Table 2 experimental data for a specific type of battery with different sizes of plates and different numbers of plates per cell. In this table, we show the charging time and the average current needed for the respective charge. Table 2 shows that to form a battery we need to apply a particular current intensity for a certain amount of time to achieve the required charge. The goal of battery manufacturers is to reduce the time required to charge the battery. However, the current intensity cannot be increased arbitrarily because of the physical characteristics of the specific battery [12]. If the current is increased too much, the temperature in the battery will rise above a safe value, eventually causing the destruction of the battery.

8.1. Fuzzy Method for Control. In this approach we use a statistical model to represent the electrochemical process and a fuzzy rule base for process control. The temperature in the battery depends on the electrical current that circulates in it during its formation; this means that to maintain the temperature below a specific threshold it is important to control the intensity of the current. Therefore, for this case the independent variable is the average current I, and the dependent variable is the average temperature T. A simple statistical linear model can be stated as follows:

$$
T = \beta_0 + \beta_1 I,
\tag{25}
$$
where β0 and β1 are parameters to be estimated (by least squares) using real data for this problem. In Table 3, we show experimental values for a 6-volt battery, which according to the manufacturer's specifications should be charged using 200 ampere-hours. Using the data from Table 3 we can obtain (by the least squares method) the values of β0 and β1 [28]. The resulting equation is as follows:

$$
T = 88.03 + 2.5304\,I,
\tag{26}
$$
with a correlation value of only 0.57, which is due to the complexity of the data. For the fuzzy controller we used as input variables the temperature T and the change of temperature dT/dt, and as output variable the current intensity that should be applied to the battery. In Figure 15 we show the architecture of our control system.

Figure 15. Fuzzy control of the process. (Block diagram: the inputs T and dT/dt enter the fuzzy controller, whose output I drives the electrochemical process, whose output is T.)

The control method was implemented in the MATLAB language. For each of the linguistic variables it was considered convenient to use five terms. In Figure 16 we show the fuzzy rule base implemented in the Fuzzy Logic Toolbox of MATLAB. We have 25 rules because we are using 5 linguistic terms for each variable. The membership functions were tuned manually until they gave the best values for the problem.

Table 2. Experimental data for different types of batteries.

             Plate type: positive 0.060", negative 0.050"     Plate type: positive 0.070", negative 0.060"
Plates/cell  Total A.H.    72 hr Amp.    96 hr Amp.           Total A.H.    72 hr Amp.    96 hr Amp.
7            155           2.2           1.6                  165           2.4           1.8
9            180           2.8           2.0                  200           2.8           2.2
11           230           3.2           2.4                  245           3.4           2.4
13           260           3.6           2.6                  295           4.0           3.0
15           300           4.2           3.0                  345           4.8           3.6
17           400           5.6           4.2                  415           5.8           4.4

Table 3. Values of temperature and current for a battery of 200 ampere-hours.

Hrs     T     I        Hrs     T     I
21:00   111   5.22     23:00   93    3.53
23:00   100   5.21     1:00    91    3.40
1:00    105   5.52     3:00    92    3.32
3:00    100   5.66     5:00    96    3.16
5:00    100   5.60     7:00    98    3.10
7:00    97    5.72     9:00    98    3.14
9:00    92    4.82     11:00   102   3.12
11:00   95    4.32     13:00   99    3.03
13:00   102   4.10     15:00   98    3.05
15:00   103   4.05     17:00   97    3.06
17:00   100   3.40     19:00   95    2.96
19:00   97    3.77     21:00   94    2.60
21:00   94    3.62     23:00   96    2.76
Figure 16. Fuzzy rule base for controlling the Process.
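To make the structure of such a controller concrete, the following Python sketch builds a rule base with five triangular terms for each of the two inputs (T and dT/dt), which gives 5 × 5 = 25 rules. Everything numeric in it is hypothetical: the input ranges, the output current levels and the rule table are illustrative guesses, not the manually tuned values of the original MATLAB system. The output is computed as a weighted average of singleton current levels, which is a simplification chosen here to keep the example short.

```python
# All ranges, levels and rules below are hypothetical illustrations.
T_CENTERS = [80.0, 90.0, 100.0, 110.0, 120.0]  # temperature terms
DT_CENTERS = [-5.0, -2.5, 0.0, 2.5, 5.0]       # dT/dt terms
I_LEVELS = [1.0, 2.0, 3.0, 4.0, 5.0]           # output current levels (amperes)


def memberships(x, centers):
    """Membership degrees of x in five evenly spaced triangular terms.

    Inputs outside the covered range are clamped to the nearest end term.
    """
    x = min(max(x, centers[0]), centers[-1])
    width = centers[1] - centers[0]
    return [max(0.0, 1.0 - abs(x - c) / width) for c in centers]


def rule_output_index(i, j):
    """Hypothetical rule table: the hotter the battery (i) and the faster
    its temperature rises (j), the lower the commanded current level."""
    return 4 - (i + j + 1) // 2


def fuzzy_current(temp, dtemp):
    """Evaluate all 25 rules and return the weighted-average current."""
    mu_t = memberships(temp, T_CENTERS)
    mu_d = memberships(dtemp, DT_CENTERS)
    num = den = 0.0
    for i, a in enumerate(mu_t):
        for j, b in enumerate(mu_d):
            w = min(a, b)  # rule firing strength (AND as minimum)
            num += w * I_LEVELS[rule_output_index(i, j)]
            den += w
    return num / den
```

With these placeholder settings, a cool battery whose temperature is falling receives a higher current than a hot battery whose temperature is rising, which is the qualitative behavior the controller is meant to provide.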
8.2. Neuro-Fuzzy Method for Control. Since it is difficult to tune a particular inference system to model a complex dynamical system [1], it is convenient to use adaptive fuzzy inference systems. Adaptive neuro-fuzzy inference systems (ANFIS) can be used to adapt the membership functions and consequents of the rule base according to historical data of the problem [13]. In this case, we can use the data from Table 2 and apply the ANFIS methodology to find the best fuzzy system for our problem. We used the Fuzzy Logic Toolbox of MATLAB to apply the ANFIS methodology to our problem, with 5 membership functions and first-order Sugeno functions in the consequents. We show in Figure 17 the non-linear surface for control.
Figure 17. ANFIS surface for the process.
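For reference, the output of a first-order Sugeno system is a weighted average of linear functions of the inputs. Assuming the same two inputs as in the fuzzy controller (the temperature T and its rate of change dT/dt), it can be written as follows, where the rule count R, the firing strengths w_k and the consequent coefficients a_k, b_k, c_k are notation introduced here rather than taken from the original:

$$
I = \frac{\sum_{k=1}^{R} w_k \left( a_k T + b_k \frac{dT}{dt} + c_k \right)}{\sum_{k=1}^{R} w_k}
$$

ANFIS adjusts both the membership function parameters that determine each w_k and the consequent coefficients of each rule, using the training data.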
8.3. Neuro-Fuzzy-Genetic Control. In this case, neural networks are used for modelling the electrochemical process, fuzzy logic for controlling the electrical current, and genetic algorithms for adapting the membership functions of the fuzzy system [8]. A multilayer feedforward neural network was used for modelling the electrochemical process. We used the data from Table 3 and the Levenberg-Marquardt learning algorithm to train the neural network. We used a three-layer neural network with 15 nodes in the hidden layer. After training for 2000 epochs, the sum of squared errors was reduced from about 200 initially to 11.25, which is a very good approximation in this case. The fuzzy rule base was implemented in the Fuzzy Logic Toolbox of MATLAB. In this case, 25 fuzzy rules were used because there were 5 linguistic terms for each input variable.
8.4. Experimental Results. The three hybrid control systems were compared by simulating the formation (loading) of a 6-volt battery. This particular battery is manually loaded (in the plant) by applying 2 amperes for 50 hours under the manufacturer's specifications. We show in Table 4 the experimental results.

Table 4. Comparison of the methods for control.

Control Method          Loading Time
Manual Control          50 hours
Conventional Control    36 hours
Fuzzy Control           32 hours
Neuro-Fuzzy Control     30 hours
Neuro-Fuzzy-Genetic     25 hours
We can see from Table 4 that the fuzzy control method reduces the time required to charge the battery by 36% compared with manual control, and by 11.11% compared with conventional PID control [27]. We can also see that ANFIS reduces this time even further, because neural networks are used to adapt the intelligent system: the reduction is now 40% with respect to manual control. Finally, the neuro-fuzzy-genetic approach reduces the time further still, because the genetic algorithm optimizes the fuzzy system. In this case, the reduction is 50% with respect to manual control.

We have described in this section three different approaches for controlling an electrochemical process. We have shown that, for this type of application, the use of several soft computing techniques can help in reducing the time required to produce a battery. Even fuzzy control alone can reduce the formation time of a battery, but using neural networks and genetic algorithms reduces the production time even more. Of course, this means that manufacturers can produce the batteries in half the time needed before.
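The percentages quoted above follow directly from the loading times in Table 4. The short Python sketch below recomputes them; the dictionary and function names are our own.

```python
# Loading times in hours, as listed in Table 4.
LOADING_HOURS = {
    "Manual Control": 50.0,
    "Conventional Control": 36.0,
    "Fuzzy Control": 32.0,
    "Neuro-Fuzzy Control": 30.0,
    "Neuro-Fuzzy-Genetic": 25.0,
}


def reduction_percent(method, baseline):
    """Percentage reduction in loading time of `method` relative to `baseline`."""
    t = LOADING_HOURS[method]
    t0 = LOADING_HOURS[baseline]
    return 100.0 * (t0 - t) / t0
```

For instance, fuzzy control against manual control gives 100 × (50 − 32)/50 = 36%, and against conventional control 100 × (36 − 32)/36 ≈ 11.11%.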
9. Conclusions

We can say that hybrid intelligent systems can be used to solve difficult real-world problems. Of course, the right hybrid architecture (and combination) has to be selected. At the moment, there are no general rules to decide on the right architecture for specific classes of problems. However, we can use the experience that other researchers have gained on these problems and use it to our advantage. Also, we always have to turn to experimental work to test different combinations of soft computing techniques and decide on the best one for ourselves. Finally, we can conclude that the use of soft computing for controlling dynamical systems is a very fruitful area of research, because of the excellent results that can be achieved without using complex mathematical models [8, 23].

References

[1] Albertos, P., Strietzel, R. and Mart, N. (1997). “Control Engineering Solutions: A practical approach”, IEEE Computer Society Press.
[2] Bode, H., Brodd, R.J. and Kordesch, K.V. (1977). Lead-Acid Batteries, John Wiley & Sons.
[3] Castillo, O. and Melin, P. (1994). “Developing a New Method for the Identification of Microorganisms for the Food Industry using the Fractal Dimension”, Journal of Fractals, 2, No. 3, pp. 457-460.
[4] Castillo, O. and Melin, P. (1997). “Mathematical Modelling and Simulation of Robotic Dynamic Systems using Fuzzy Logic Techniques and Fractal Theory”, Proceedings of IMACS’97, Berlin, Germany, Vol. 5, pp. 343-348.
[5] Castillo, O. and Melin, P. (1998). “A New Fuzzy-Fractal-Genetic Method for Automated Mathematical Modelling and Simulation of Robotic Dynamic Systems”, Proceedings of FUZZ’98, IEEE Press, Anchorage, Alaska, USA, Vol. 2, pp. 1182-1187.
[6] Castillo, O. and Melin, P. (1999). “A New Fuzzy Inference System for Reasoning with Multiple Differential Equations for Modelling Complex Dynamical Systems”, Proceedings of CIMCA’99, IOS Press, Vienna, Austria, pp. 224-229.
[7] Castillo, O. and Melin, P. (1999). “Automated Mathematical Modelling, Simulation and Behavior Identification of Robotic Dynamic Systems using a New Fuzzy-Fractal-Genetic Approach”, Journal of Robotics and Autonomous Systems, Elsevier, Vol. 28, No. 1, pp. 19-30.
[8] Castillo, O. and Melin, P. (2001). “Soft Computing for Control of Non-Linear Dynamical Systems”, Springer-Verlag, Heidelberg, Germany.
[9] Chen, G. and Pham, T. T. (2001). “Introduction to Fuzzy Sets, Fuzzy Logic, and Fuzzy Control Systems”, CRC Press, Boca Raton, Florida, USA.
[10] Fu, K.S., Gonzalez, R.C. and Lee, C.S.G. (1987). “Robotics: Control, Sensing, Vision and Intelligence”, McGraw-Hill.
[11] Goldfeld, S. M., Quandt, R. E. and Trotter, H. F. (1966). “Maximization by Quadratic Hill Climbing”, Econometrica, Vol. 34, pp. 541-551.
[12] Hehner, N. and Orsino, J.A. (1985). Storage Battery Manufacturing Manual III, Independent Battery Manufacturers Association.
[13] Jang, J.R., Sun, C.T. and Mizutani, E. (1997). Neuro-Fuzzy and Soft Computing, Prentice Hall.
[14] Mandelbrot, B. (1987). “The Fractal Geometry of Nature”, W.H. Freeman and Company.
[15] Marquardt, D. W. (1963). “An Algorithm for Least Squares Estimation of Non-Linear Parameters”, Journal of the Society of Industrial and Applied Mathematics, Vol. 11, pp. 431-441.
[16] Melin, P. and Castillo, O. (1996). “Modelling and Simulation for Bacteria Growth Control in the Food Industry using Artificial Intelligence”, Proceedings of CESA’96, Gerf EC Lille, Lille, France, pp. 676-681.
[17] Melin, P. and Castillo, O. (1997). “An Adaptive Model-Based Neural Network Controller for Biochemical Reactors in the Food Industry”, Proceedings of Control’97, Acta Press, Canada, pp. 147-150.
[18] Melin, P. and Castillo, O. (1997). “An Adaptive Neural Network System for Bacteria Growth Control in the Food Industry using Mathematical Modelling and Simulation”, Proceedings of IMACS World Congress’97, W & T Verlag, Berlin, Germany, Vol. 4, pp. 203-208.
[19] Melin, P. and Castillo, O. (1997). “Automated Mathematical Modelling and Simulation for Bacteria Growth Control in the Food Industry using Artificial Intelligence and Fractal Theory”, Journal of Systems, Analysis, Modelling and Simulation, Gordon and Breach, pp. 189-206.
[20] Melin, P. and Castillo, O. (1998). “An Adaptive Model-Based Neuro-Fuzzy-Fractal Controller for Biochemical Reactors in the Food Industry”, Proceedings of IJCNN’98, Anchorage, Alaska, USA, Vol. 1, pp. 106-111.
[21] Melin, P. and Castillo, O. (1998). “A New Method for Adaptive Model-Based Neuro-Fuzzy-Fractal Control of Non-Linear Dynamic Plants: The Case of Biochemical Reactors”, Proceedings of IPMU’98, EDK Publishers, Paris, France, Vol. 1, pp. 475-482.
[22] Melin, P. and Castillo, O. (1999). “A New Method for Adaptive Model-Based Neuro-Fuzzy-Fractal of Non-Linear Dynamical Systems”, Proceedings of ICNPAA, European Conference Publications, Daytona Beach, USA, pp. 499-506.
[23] Melin, P. and Castillo, O. (2002). “Modelling, Simulation and Control of Non-Linear Dynamical Systems”, Taylor and Francis Publishers, London, Great Britain.
[24] Miller, W.T., Sutton, R.S. and Werbos, P.J. (1995). Neural Networks for Control, MIT Press.
[25] Nakamura, S. (1997). Numerical Analysis and Graphic Visualization with MATLAB, Prentice-Hall.
[26] Narendra, K. S. and Annaswamy, A. M. (1989). Stable Adaptive Systems, Prentice Hall Publishing.
[27] Rasband, S.N. (1990). Chaotic Dynamics of Non-Linear Systems, John Wiley & Sons.
[28] Sepulveda, R., Castillo, O., Montiel, O. and Lopez, M. (1998). “Analysis of Fuzzy Control System for Process of Forming Batteries”, ISRA’98, Mexico, pp. 203-210.
[29] Sugeno, M. and Kang, G. T. (1988). “Structure Identification of Fuzzy Model”, Fuzzy Sets and Systems, 28, pp. 15-33.
[30] Takagi, T. and Sugeno, M. (1985). “Fuzzy Identification of Systems and its Applications to Modelling and Control”, IEEE Transactions on Systems, Man and Cybernetics, 15, pp. 116-132.
[31] Ungar, L. H. (1995). A Bioreactor Benchmark for Adaptive Network-Based Process Control, Neural Networks for Control, MIT Press, pp. 387-402.
[32] Zadeh, L. A. (1975). “The Concept of a Linguistic Variable and its Application to Approximate Reasoning”, Information Sciences, 8, pp. 43-80.

Department of Computer Science, Tijuana Institute of Technology, P.O. Box 4207, Chula Vista CA, 91909, U.S.A.
E-mail address:
[email protected] (P. Melin)